To the memory of my father


I will never believe that God plays dice with the universe. 

Albert Einstein 


Then they gave lots to them, and the lot fell upon Matthias, 
and he was counted with the eleven apostles. 


Acts 1: 26 


Preface 


The main intended audience for this book is undergraduate students in pure and 
applied sciences, especially those in engineering. Chapters 2 to 4 cover the probability 
theory they generally need in their training. Although the treatment of the subject is 
surely sufficient for non-mathematicians, I intentionally avoided getting too much into 
detail. For instance, topics such as mixed type random variables and the Dirac delta 
function are only briefly mentioned. 

Courses on probability theory are often considered difficult. However, after having 
taught this subject for many years, I have come to the conclusion that one of the biggest 
problems that the students face when they try to learn probability theory, particularly 
nowadays, is their deficiencies in basic differential and integral calculus. Integration by 
parts, for example, is often already forgotten by the students when they take a course 
on probability. For this reason, I have decided to write a chapter reviewing the basic 
elements of differential calculus. Even though this chapter might not be covered in class, 
the students can refer to it when needed. In this chapter, an effort was made to give the 
readers a good idea of the use in probability theory of the concepts they should already 
know. 

Chapter 2 presents the main results of what is known as elementary probability , 
including Bayes’ rule and elements of combinatorial analysis. Although these notions 
are not mathematically complicated, it is often a chapter that the students find hard 
to master. There is no trick other than doing a lot of exercises to become comfortable 
with this material. 

Chapter 3 is devoted to the more technical subject of random variables. All the 
important models for the applications, such as the binomial and normal distributions, 
are introduced. In general, the students do better when examined on this subject and 
feel that their work is more rewarded than in the case of combinatorial analysis, in 
particular. 

Random vectors, including the all-important central limit theorem, constitute the 
subject of Chapter 4. I have endeavored to present the material as simply as possible. 
Nevertheless, it is obvious that double integrals cannot be simpler than single integrals. 

Applications of Chapters 2 to 4 are presented in Chapters 5 to 7. First, Chapter 5 is 
devoted to the important subject of reliability theory, which is used in most engineering 
disciplines, in particular in mechanical engineering. Next, the basic queueing models are 
studied in Chapter 6. Queueing theory is needed for many computer science engineering 
students, as well as for those in industrial engineering. Finally, the last application 
considered, in Chapter 7, is the concept of time series. Civil engineers, notably those 
specialized in hydrology, make use of stochastic processes of this type when they want 
to model various phenomena and forecast the future values of a given variable, such as 
the flow of a river. Time series are also widely used in economics and finance to represent
the variations of certain indices. 




No matter the level and the background of the students taking a course on probability 
theory, one thing is always true: as mentioned above, they must try to solve many 
exercises before they can feel that they have mastered the theory. To this end, the book 
contains more than 400 exercises, many of which are multiple part questions. At the 
end of each chapter, the reader will find some solved exercises, whose solutions can be 
found in Appendix C, followed by a large number of unsolved exercises. Answers to the 
even-numbered questions are provided in Appendix D at the end of the book. There are 
also many multiple choice questions, whose answers are given in Appendix E. 

It is my pleasure to thank all the people I worked with over the years at the Ecole 
Polytechnique de Montreal and who provided me with interesting exercises that were 
included in this work. 

Finally, I wish to express my gratitude to Vaishali Damle, and the entire publishing 
team at Springer, for their excellent support throughout this book project. 


Mario Lefebvre 
Montreal, July 2008


1 Review of differential calculus


This chapter presents the main elements of differential calculus needed in probability
theory. Often, students taking a course on probability theory have problems with
concepts such as integrals and infinite series. In particular, the integration by parts
technique is recalled.


1.1 Limits and continuity 

The first concept that we recall is that of the limit of a function, which is defined formally 
as follows. 

Definition 1.1.1. Let $f$ be a real-valued function. We say that $f(x)$ tends to $f_0$ ($\in \mathbb{R}$)
as $x$ tends to $x_0$ if for any positive number $\epsilon$ there exists a positive number $\delta$ such that

$$0 < |x - x_0| < \delta \implies |f(x) - f_0| < \epsilon.$$

We write that $\lim_{x \to x_0} f(x) = f_0$. That is, $f_0$ is the limit of the function $f(x)$ as $x$
tends to $x_0$.

Remarks. (i) The function $f(x)$ need not be defined at the point $x_0$ for the limit to exist.

(ii) It is possible that $f(x_0)$ exists, but $f(x_0) \neq f_0$.

(iii) We write that $\lim_{x \to x_0} f(x) = \infty$ if, for any $M > 0$ (as large as we want), there
exists a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \implies f(x) > M.$$

Similarly for $\lim_{x \to x_0} f(x) = -\infty$.

(iv) In the definition, $x_0$ is assumed to be a real number. However, the definition can
actually be extended to the case when $x_0 = \pm\infty$.


M. Lefebvre, Basic Probability Theory with Applications, Springer Undergraduate Texts in Mathematics
and Technology, DOI: 10.1007/978-0-387-74995-2_1


Sometimes, we are interested in the limit of the function $f(x)$ as $x$ decreases or
increases to a given real number $x_0$. The right-hand limit (resp., left-hand limit) of the
function $f(x)$ as $x$ decreases (resp., increases) to $x_0$ is denoted by $\lim_{x \downarrow x_0} f(x)$ [resp.,
$\lim_{x \uparrow x_0} f(x)$]. Some authors write $\lim_{x \to x_0^+} f(x)$ [resp., $\lim_{x \to x_0^-} f(x)$]. If the limit of
$f(x)$ as $x$ tends to $x_0$ exists, then

$$\lim_{x \downarrow x_0} f(x) = \lim_{x \uparrow x_0} f(x) = \lim_{x \to x_0} f(x).$$


Definition 1.1.2. The real-valued function $f(x)$ is said to be continuous at the point
$x_0 \in \mathbb{R}$ if (i) it is defined at this point, (ii) the limit as $x$ tends to $x_0$ exists, and (iii)
$\lim_{x \to x_0} f(x) = f(x_0)$. If $f$ is continuous at every point $x_0 \in [a, b]$ [or $(a, b)$, etc.], then
$f$ is said to be continuous on this interval.


Remarks. (i) In this textbook, a closed interval is denoted by $[a, b]$, whereas $(a, b)$ is an
open interval. We also have, of course, the intervals $[a, b)$ and $(a, b]$.

(ii) If, in the definition, we instead require that the limit $\lim_{x \downarrow x_0} f(x)$ [resp., $\lim_{x \uparrow x_0} f(x)$]
exists and is equal to $f(x_0)$, then the function is said to be right-continuous (resp.,
left-continuous) at $x_0$. A function that is continuous at a given point $x_0$ such that $a < x_0 < b$
is both right-continuous and left-continuous at that point.

(iii) A function $f$ is said to be piecewise continuous on an interval $[a, b]$ if this interval
can be divided into a finite number of subintervals on which $f$ is continuous and has
right- and left-hand limits.

(iv) Let $f(x)$ and $g(x)$ be two real-valued functions. The composition of the two functions
is denoted by $g \circ f$ and is defined by

$$(g \circ f)(x) = g[f(x)].$$


A result used in Chapter 3 states that the composition of two continuous functions is 
itself a continuous function. 

Example 1.1.1. Consider the function

$$u(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x \geq 0, \end{cases} \tag{1.1}$$

which is known as the Heaviside or unit step function. In probability, this function
corresponds to the distribution function of the constant 0. It is also used to indicate
that the possible values of a certain random variable are the set of nonnegative real
numbers. For example, writing that

$$f_X(x) = e^{-x} u(x) \quad \text{for all } x \in \mathbb{R}$$

is tantamount to writing that

$$f_X(x) = \begin{cases} 0 & \text{if } x < 0, \\ e^{-x} & \text{if } x \geq 0, \end{cases}$$

where $f_X(x)$ is called the density function of the random variable $X$.

The function $u(x)$ is defined for all $x \in \mathbb{R}$. In other contexts, $u(0)$ is chosen differently
from above. For instance, in some applications $u(0) = 1/2$. At any rate, the unit step
function is continuous everywhere, except at the origin, because (for us)

$$\lim_{x \downarrow 0} u(x) = 1 \quad \text{and} \quad \lim_{x \uparrow 0} u(x) = 0.$$

However, with the choice $u(0) = 1$, we may assert that $u(x)$ is right-continuous at
$x = 0$.


The previous definitions can be extended to the case of real-valued functions of two
(or more) variables. In particular, the function $f(x, y)$ is continuous at the point $(x_0, y_0)$
if

$$\lim_{\substack{x \to x_0 \\ y \to y_0}} f(x, y) = f\Big(\lim_{x \to x_0} x, \lim_{y \to y_0} y\Big).$$

This formula implies that the function $f(x, y)$ is defined at $(x_0, y_0)$ and that the limit
of $f(x, y)$ as $(x, y)$ tends to $(x_0, y_0)$ exists and is equal to $f(x_0, y_0)$.


1.2 Derivatives 


Definition 1.2.1. Suppose that the function $f(x)$ is defined at $x_0 \in (a, b)$. If

$$f'(x_0) := \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \equiv \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$$

exists, we say that the function $f(x)$ is differentiable at the point $x_0$ and that $f'(x_0)$
is the derivative of $f(x)$ (with respect to $x$) at $x_0$.


Remarks. (i) For the function $f(x)$ to be differentiable at $x_0$, it must at least be
continuous at that point. However, this condition is not sufficient, as can be seen in the
example below.

(ii) If the limit is taken as $x \downarrow x_0$ (resp., $x \uparrow x_0$) in the previous definition, then the
result (if the limit exists) is called the right-hand (resp., left-hand) derivative of $f(x)$ at
$x_0$ and is sometimes denoted by $f'(x_0^+)$ [resp., $f'(x_0^-)$]. If $f'(x_0)$ exists, then
$f'(x_0^+) = f'(x_0^-)$.

(iii) The derivative of $f$ at an arbitrary point $x$ is also denoted by $\frac{d}{dx} f(x)$, or by $Df(x)$.
If we set $y = f(x)$, then the derivative is also written as $\frac{dy}{dx}$.

(iv) If we differentiate $f'(x)$, we obtain the second-order derivative of the function $f$,
denoted by $f''(x)$ or $\frac{d^2}{dx^2} f(x)$. Similarly, $f'''(x)$ [or $f^{(3)}(x)$, or $\frac{d^3}{dx^3} f(x)$] is the third-order
derivative of $f$, and so on.

(v) One way to find the values of $x$ that maximize or minimize the function $f(x)$ is
to calculate the first-order derivative $f'(x)$ and to solve the equation $f'(x) = 0$. If
$f'(x_0) = 0$ and $f''(x_0) < 0$ [resp., $f''(x_0) > 0$], then $f$ has a relative maximum (resp.,
minimum) at $x = x_0$. If $f'(x) \neq 0$ for all $x \in \mathbb{R}$, we can check whether the function $f(x)$
is always increasing or decreasing in the interval of interest.

Example 1.2.1. The function $f(x) = |x|$ is continuous everywhere, but is not
differentiable at the origin, because we find that, at $x_0 = 0$,

$$f'(x_0^+) = 1 \quad \text{and} \quad f'(x_0^-) = -1.$$

Example 1.2.2. The function $u(x)$ defined in Example 1.1.1 is obviously not
differentiable at $x = 0$, because it is not continuous at that point.

Example 1.2.3. The function

$$F_X(x) = \begin{cases} 0 & \text{if } x < 0, \\ x & \text{if } 0 \leq x \leq 1, \\ 1 & \text{if } x > 1 \end{cases}$$

is defined and continuous everywhere. It is also differentiable everywhere, except at
$x = 0$ and $x = 1$. We obtain that the derivative $F_X'(x)$ of $F_X(x)$, which is denoted by
$f_X(x)$ in probability theory, is given by

$$f_X(x) = \begin{cases} 1 & \text{if } 0 < x < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

Note that $f_X(x)$ is discontinuous at $x = 0$ and $x = 1$. The function $F_X(x)$ is an example
of what is known in probability as a distribution function.

Remark. Using the theory of distributions, we may write that the derivative of the
Heaviside function $u(x)$ is the Dirac delta function $\delta(x)$ defined by

$$\delta(x) = \begin{cases} 0 & \text{if } x \neq 0, \\ \infty & \text{if } x = 0. \end{cases} \tag{1.2}$$

The Dirac delta function is actually a generalized function. It is, by definition, such that

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1.$$

We also have, if $f(x)$ is continuous at $x = x_0$, that

$$\int_{-\infty}^{\infty} f(x)\,\delta(x - x_0)\,dx = f(x_0).$$


We do not recall the basic differentiation rules, except that for the derivative of a
quotient:

$$\frac{d}{dx} \frac{f(x)}{g(x)} = \frac{g(x) f'(x) - f(x) g'(x)}{g^2(x)} \quad \text{if } g(x) \neq 0.$$

Remark. Note that this formula can also be obtained by differentiating the product
$f(x) h(x)$, where $h(x) := 1/g(x)$.

Likewise, the formulas for calculating the derivatives of elementary functions are 
assumed to be known. However, a formula that is worth recalling is the so-called chain 
rule for derivatives. 

Proposition 1.2.1. (Chain rule) Let the real-valued function $h(x)$ be the composite
function $(g \circ f)(x)$. If $f$ is differentiable at $x$ and $g$ is differentiable at the point $f(x)$,
then $h$ is also differentiable at $x$ and

$$h'(x) = g'[f(x)]\, f'(x).$$

Remark. If we set $y = f(x)$, then the chain rule may be written as

$$\frac{d}{dx} g(y) = \frac{d}{dy} g(y) \cdot \frac{dy}{dx} = g'(y)\, f'(x).$$


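Example 1.2.4. Consider the function $h(x) = \sqrt{x^2 + 1}$. We may write that $h(x) =
(g \circ f)(x)$, with $f(x) = x^2 + 1$ and $g(x) = \sqrt{x}$. We have:

$$g'(x) = \frac{1}{2\sqrt{x}} \quad \text{and} \quad f'(x) = 2x \implies g'[f(x)] = \frac{1}{2\sqrt{f(x)}} = \frac{1}{2\sqrt{x^2 + 1}}.$$

Then,

$$h'(x) = \frac{1}{2\sqrt{x^2 + 1}}\,(2x) = \frac{x}{\sqrt{x^2 + 1}}.$$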
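The result of Example 1.2.4 can be checked numerically. The sketch below is only an illustration: the function names and the central-difference step are choices made here, not part of the text. It compares the chain-rule derivative $x/\sqrt{x^2+1}$ with a finite-difference approximation of $h'(x)$.

```python
import math


def h(x):
    """The composite function h(x) = sqrt(x^2 + 1) of Example 1.2.4."""
    return math.sqrt(x * x + 1.0)


def h_prime(x):
    """Derivative obtained from the chain rule: x / sqrt(x^2 + 1)."""
    return x / math.sqrt(x * x + 1.0)


def central_difference(func, x, step=1e-6):
    """Finite-difference approximation of func'(x) (step is an arbitrary choice)."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        print(x, h_prime(x), central_difference(h, x))
```

The two printed columns should agree to several decimal places at every point tried.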


Finally, another useful result is known as l'Hospital's rule.

Proposition 1.2.2. (L'Hospital's rule) Suppose that

$$\lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x) = 0$$

or that

$$\lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x) = \pm\infty.$$

If (i) $f(x)$ and $g(x)$ are differentiable in an interval $(a, b)$ containing the point $x_0$, except
perhaps at $x_0$, and (ii) the function $g(x)$ is such that $g'(x) \neq 0$ for all $x \neq x_0$ in the
interval $(a, b)$, then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}$$

holds. If the functions $f'(x)$ and $g'(x)$ satisfy the same conditions as $f(x)$ and $g(x)$, we
can repeat the process. Moreover, the constant $x_0$ may be equal to $\pm\infty$.

Remark. If $x_0 = a$ or $b$, we can replace the limit as $x \to x_0$ by $\lim_{x \downarrow a}$ or $\lim_{x \uparrow b}$,
respectively.

Example 1.2.5. In probability theory, one way of defining the density function $f_X(x)$ of
a continuous random variable $X$ is by calculating the limit of the ratio of the probability
that $X$ takes on a value in a small interval about $x$ to the length of this interval:

$$f_X(x) := \lim_{\epsilon \downarrow 0} \frac{\text{Probability that } X \in [x - (\epsilon/2), x + (\epsilon/2)]}{\epsilon}.$$

The probability in question is actually equal to zero (in the limit as $\epsilon$ decreases to
0). For example, we might have that $\{\text{Probability that } X \in [x - (\epsilon/2), x + (\epsilon/2)]\} =
\exp\{-x + (\epsilon/2)\} - \exp\{-x - (\epsilon/2)\}$, for $x > 0$ and $\epsilon$ small enough.

By making use of l'Hospital's rule (with $\epsilon$ as variable), we find that

$$f_X(x) := \lim_{\epsilon \downarrow 0} \frac{\exp\{-x + (\epsilon/2)\} - \exp\{-x - (\epsilon/2)\}}{\epsilon}
= \lim_{\epsilon \downarrow 0} \frac{\exp\{-x + (\epsilon/2)\}(1/2) - \exp\{-x - (\epsilon/2)\}(-1/2)}{1}
= \exp\{-x\} \quad \text{for } x > 0.$$
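The limit in Example 1.2.5 can also be observed numerically. The following sketch (function names and the test point $x = 1.5$ are illustrative assumptions) evaluates the ratio of the interval probability to the interval length for shrinking $\epsilon$ and compares it with $e^{-x}$.

```python
import math


def interval_probability(x, eps):
    """Probability used in Example 1.2.5: exp{-x + eps/2} - exp{-x - eps/2}."""
    return math.exp(-x + eps / 2.0) - math.exp(-x - eps / 2.0)


def density_ratio(x, eps):
    """Ratio of the interval probability to the interval length eps."""
    return interval_probability(x, eps) / eps


if __name__ == "__main__":
    x = 1.5  # arbitrary positive point chosen for the illustration
    for eps in (1.0, 0.1, 0.01, 0.001):
        print(eps, density_ratio(x, eps), math.exp(-x))
```

As $\epsilon$ decreases, the ratio approaches $e^{-x}$, in agreement with the value obtained from l'Hospital's rule.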


In two dimensions, we define the partial derivative of the function $f(x, y)$ with respect
to $x$ by

$$\frac{\partial}{\partial x} f(x, y) \equiv f_x(x, y) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}$$

when the limit exists. The partial derivative of $f(x, y)$ with respect to $y$ is defined
similarly. Note that even if the partial derivatives $f_x(x_0, y_0)$ and $f_y(x_0, y_0)$ exist, the
function $f(x, y)$ is not necessarily continuous at the point $(x_0, y_0)$. Indeed, the limit of
the function $f(x, y)$ must not depend on the way $(x, y)$ tends to $(x_0, y_0)$. For instance,
let

$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0), \\ 0 & \text{if } (x, y) = (0, 0). \end{cases} \tag{1.3}$$

Suppose that $(x, y)$ tends to $(0, 0)$ along the line $y = kx$, where $k \neq 0$. We then have:

$$\lim_{\substack{x \to 0 \\ y \to 0}} f(x, y) = \lim_{x \to 0} \frac{x(kx)}{x^2 + (kx)^2} = \frac{k}{1 + k^2} \quad (\neq 0).$$

Therefore, we must conclude that the function $f(x, y)$ is discontinuous at $(0, 0)$. However,
we can show that $f_x(0, 0)$ and $f_y(0, 0)$ both exist (and are equal to 0).
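A short numerical sketch may help to see why the function in (1.3) is discontinuous at the origin even though both partial derivatives exist there. The helper names and the step sizes below are illustrative assumptions.

```python
def f(x, y):
    """The function of (1.3): xy / (x^2 + y^2) away from the origin, 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)


def value_along_line(k, x):
    """Value of f on the line y = k x; it equals k / (1 + k^2) for every x != 0."""
    return f(x, k * x)


def partial_x_at_origin(step):
    """Difference quotient defining f_x(0, 0)."""
    return (f(step, 0.0) - f(0.0, 0.0)) / step


def partial_y_at_origin(step):
    """Difference quotient defining f_y(0, 0)."""
    return (f(0.0, step) - f(0.0, 0.0)) / step


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):
        print(k, value_along_line(k, 1e-9), k / (1.0 + k * k))
    print(partial_x_at_origin(1e-6), partial_y_at_origin(1e-6))
```

The value along each line stays at $k/(1+k^2)$ however close $x$ is to 0, so it depends on the direction of approach, whereas both difference quotients at the origin are exactly 0.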


1.3 Integrals 

Definition 1.3.1. Let $f(x)$ be a continuous (or piecewise continuous) function on the
interval $[a, b]$ and let

$$I = \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1}),$$

where $\xi_k \in [x_{k-1}, x_k]$ for all $k$ and $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$ is a partition of
the interval $[a, b]$. The limit of $I$ as $n$ tends to $\infty$, and $x_k - x_{k-1}$ decreases to 0 for all
$k$, exists and is called the (definite) integral of $f(x)$ over the interval $[a, b]$. We write:

$$\lim_{\substack{n \to \infty \\ (x_k - x_{k-1}) \downarrow 0 \ \forall k}} I = \int_a^b f(x)\,dx.$$


Remarks. (i) The limit must not depend on the choice of the partition of the interval
$[a, b]$.

(ii) The function $f(x)$ is called the integrand.

(iii) If $f(x) > 0$ in the interval $[a, b]$, then the integral of $f(x)$ over $[a, b]$ gives the area
between the curve $y = f(x)$ and the $x$-axis from $a$ to $b$.

(iv) If the interval $[a, b]$ is replaced by an infinite interval, or if the function $f(x)$ is not
defined or not bounded at one or more points in $[a, b]$, then the integral is called improper.
For example, we define the integral of $f(x)$ over the interval $[a, \infty)$ as follows:

$$\int_a^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx.$$

If the limit exists, the improper integral is said to be convergent; otherwise, it is
divergent. When $[a, b]$ is the entire real line, we should write that

$$I_1 := \int_{-\infty}^{\infty} f(x)\,dx = \lim_{a \to -\infty} \int_a^0 f(x)\,dx + \lim_{b \to \infty} \int_0^b f(x)\,dx$$

and not

$$I_1 = \lim_{b \to \infty} \int_{-b}^{b} f(x)\,dx.$$

This last integral is actually the Cauchy principal value of the integral $I_1$. The Cauchy
principal value may exist even if $I_1$ does not.

Definition 1.3.2. Let $f(x)$ be a real-valued function. Any function $F(x)$ such that
$F'(x) = f(x)$ is called a primitive (or indefinite integral or antiderivative) of
$f(x)$.

Theorem 1.3.1. (Fundamental Theorem of Calculus) Let $f(x)$ be a continuous
function on the interval $[a, b]$ and let $F(x)$ be a primitive of $f(x)$. Then,

$$\int_a^b f(x)\,dx = F(b) - F(a).$$

Example 1.3.1. The function

$$f_X(x) = \frac{1}{\pi(1 + x^2)} \quad \text{for } x \in \mathbb{R}$$

is the density function of a particular Cauchy distribution. To obtain the average value
of the random variable $X$, we can calculate the improper integral

$$\int_{-\infty}^{\infty} x f_X(x)\,dx = \int_{-\infty}^{\infty} \frac{x}{\pi(1 + x^2)}\,dx.$$

A primitive of the integrand $g(x) := x f_X(x)$ is

$$G(x) = \frac{1}{2\pi} \ln(1 + x^2).$$

We find that the improper integral diverges, because

$$\lim_{a \to -\infty} \int_a^0 g(x)\,dx = \lim_{a \to -\infty} [G(0) - G(a)] = -\infty$$

and

$$\lim_{b \to \infty} \int_0^b g(x)\,dx = \lim_{b \to \infty} [G(b) - G(0)] = \infty.$$

Because $\infty - \infty$ is indeterminate, the integral indeed diverges. However, the Cauchy
principal value of the integral is

$$\lim_{b \to \infty} \int_{-b}^{b} g(x)\,dx = \lim_{b \to \infty} [G(b) - G(-b)] = \lim_{b \to \infty} 0 = 0.$$
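The contrast between the divergent improper integral and its Cauchy principal value in Example 1.3.1 can be illustrated with the primitive $G(x)$. The sketch below is only an illustration; the function names and the sample bounds are choices made here.

```python
import math


def G(x):
    """Primitive of g(x) = x / (pi (1 + x^2)): G(x) = ln(1 + x^2) / (2 pi)."""
    return math.log(1.0 + x * x) / (2.0 * math.pi)


def integral_zero_to(b):
    """Integral of g over [0, b], computed as G(b) - G(0)."""
    return G(b) - G(0.0)


def symmetric_integral(b):
    """Integral of g over [-b, b], computed as G(b) - G(-b)."""
    return G(b) - G(-b)


if __name__ == "__main__":
    for b in (10.0, 1e3, 1e6, 1e9):
        # The one-sided integral keeps growing (like ln(b) / pi),
        # whereas the symmetric integral is 0 for every b.
        print(b, integral_zero_to(b), symmetric_integral(b))
```

The one-sided values grow without bound, whereas the symmetric integral is 0 for every $b$, which is why only the principal value exists.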


There are many results on integrals that could be recalled. We limit ourselves to 
mentioning a couple of techniques that are helpful to find indefinite integrals or evaluate 
definite integrals. 






1.3.1 Particular integration techniques 


(1) First, we remind the reader of the technique known as integration by substitution.
Let $x = g(y)$. We can write that

$$\int f(x)\,dx = \int f[g(y)]\, g'(y)\,dy.$$

This result actually follows from the chain rule. In the case of a definite integral,
assuming that

(i) $f(x)$ is continuous on the interval $[a, b]$,

(ii) the inverse function $g^{-1}(x)$ exists and

(iii) $g'(y)$ is continuous on $[g^{-1}(a), g^{-1}(b)]$ (resp., $[g^{-1}(b), g^{-1}(a)]$) if $g(y)$ is an increasing
(resp., decreasing) function,

we have:

$$\int_a^b f(x)\,dx = \int_c^d f[g(y)]\, g'(y)\,dy,$$

where $a = g(c) \iff c = g^{-1}(a)$ and $b = g(d) \iff d = g^{-1}(b)$.

Example 1.3.2. Suppose that we want to evaluate the definite integral

$$I_2 := \int_0^4 x^{-1/2} e^{x^{1/2}}\,dx.$$

Making the substitution $x = g(y) = y^2$, so that $y = g^{-1}(x) = x^{1/2}$ (for $x \in [0, 4]$), we
can write that

$$\int_0^4 x^{-1/2} e^{x^{1/2}}\,dx \overset{y = x^{1/2}}{=} \int_0^2 y^{-1} e^{y}\, 2y\,dy = 2e^{y}\Big|_0^2 = 2(e^2 - 1).$$

Remark. If we are only looking for a primitive of the function $f(x)$, then after having
found a primitive of $f[g(y)]\, g'(y)$ we must replace $y$ by $g^{-1}(x)$ (assuming it exists) in
the function obtained. Thus, in the above example, we have:

$$\int x^{-1/2} e^{x^{1/2}}\,dx = \int y^{-1} e^{y}\, 2y\,dy = 2e^{y} = 2e^{x^{1/2}}.$$
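As a rough check of Example 1.3.2, both forms of the integral can be approximated with a simple midpoint rule. This is only a sketch: the midpoint rule, the numbers of subintervals and all names are assumptions made for the illustration. The original integrand is unbounded near $x = 0$, so its approximation converges slowly, whereas the substituted integrand $2e^{y}$ is smooth.

```python
import math


def midpoint_rule(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b] with n cells."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))


def original_integrand(x):
    """x^{-1/2} exp(x^{1/2}), the integrand of I_2 (unbounded as x decreases to 0)."""
    return x ** -0.5 * math.exp(math.sqrt(x))


def substituted_integrand(y):
    """f[g(y)] g'(y) = y^{-1} e^{y} (2y) = 2 e^{y}, obtained with x = y^2."""
    return 2.0 * math.exp(y)


exact = 2.0 * (math.e ** 2 - 1.0)
after = midpoint_rule(substituted_integrand, 0.0, 2.0, 1_000)
before = midpoint_rule(original_integrand, 0.0, 4.0, 200_000)

if __name__ == "__main__":
    print(exact, after, before)
```

With far fewer evaluations, the substituted form is much closer to $2(e^2 - 1)$, which illustrates one practical benefit of the substitution.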


(2) Next, a very useful integration technique is based on the formula for the derivative
of a product:

$$\frac{d}{dx} f(x) g(x) = f'(x) g(x) + f(x) g'(x).$$

Integrating both sides of the previous equation, we obtain:

$$f(x) g(x) = \int f'(x) g(x)\,dx + \int f(x) g'(x)\,dx
\iff \int f(x) g'(x)\,dx = f(x) g(x) - \int f'(x) g(x)\,dx.$$

Setting $u = f(x)$ and $v = g(x)$, we can write that

$$\int u\,dv = uv - \int v\,du.$$

This technique is known as integration by parts. It is often used in probability to
calculate the moments of a random variable.


Example 1.3.3. To obtain the expected value of the square of a standard normal random
variable, we can try to evaluate the integral

$$I_3 := \int_{-\infty}^{\infty} c x^2 e^{-x^2/2}\,dx,$$

where $c$ is a positive constant. When applying the integration by parts technique, one
must decide which part of the integrand to assign to $u$. Of course, $u$ should be chosen
so that the new (indefinite) integral is easier to find. In the case of $I_3$, we set

$$u = cx \quad \text{and} \quad dv = x e^{-x^2/2}\,dx.$$

Because

$$v = \int dv = \int x e^{-x^2/2}\,dx = -e^{-x^2/2},$$

it follows that

$$I_3 = cx\left(-e^{-x^2/2}\right)\Big|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} c\, e^{-x^2/2}\,dx.$$

The constant $c$ is such that the above improper integral is equal to 1 (see Chapter 3).
Furthermore, making use of l'Hospital's rule, we find that

$$\lim_{x \to \infty} x e^{-x^2/2} = \lim_{x \to \infty} \frac{x}{e^{x^2/2}} = \lim_{x \to \infty} \frac{1}{x e^{x^2/2}} = 0.$$

Similarly,

$$\lim_{x \to -\infty} x e^{-x^2/2} = 0.$$

Hence, we may write that $I_3 = 0 + 1 = 1$.
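The value $I_3 = 1$ can be confirmed numerically. The sketch below assumes $c = 1/\sqrt{2\pi}$, which is the value that makes $\int_{-\infty}^{\infty} c\, e^{-x^2/2}\,dx = 1$; the truncation of the real line to $[-10, 10]$, the midpoint rule and the names are also choices made for the illustration.

```python
import math

# Assumption: c is the normalizing constant of the standard normal density.
C = 1.0 / math.sqrt(2.0 * math.pi)


def midpoint_rule(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b] with n cells."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))


def normal_density(x):
    """c exp(-x^2 / 2), the integrand left after integrating by parts."""
    return C * math.exp(-x * x / 2.0)


def second_moment_integrand(x):
    """c x^2 exp(-x^2 / 2), the integrand of I_3."""
    return C * x * x * math.exp(-x * x / 2.0)


def boundary_term(x):
    """The term u v = c x (-exp(-x^2 / 2)) evaluated at x."""
    return C * x * (-math.exp(-x * x / 2.0))


I3 = midpoint_rule(second_moment_integrand, -10.0, 10.0, 20_000)
normalization = midpoint_rule(normal_density, -10.0, 10.0, 20_000)

if __name__ == "__main__":
    print(I3, normalization, boundary_term(10.0), boundary_term(-10.0))
```

Both integrals come out very close to 1 and the boundary term is negligible at $\pm 10$, in line with $I_3 = 0 + 1$.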

Remarks. (i) Note that there is no elementary function that is a primitive of the
function $e^{-x^2/2}$. Therefore, we could not have obtained a primitive (in terms of elementary
functions) of $x^2 e^{-x^2/2}$ by proceeding as above.

(ii) If we set $u = c e^{-x^2/2}$ (and $dv = x^2\,dx$) instead, then the resulting integral will be
more complicated. Indeed, we would have:

$$I_3 = c e^{-x^2/2}\, \frac{x^3}{3}\Big|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} c\, \frac{x^4}{3}\, e^{-x^2/2}\,dx = \int_{-\infty}^{\infty} c\, \frac{x^4}{3}\, e^{-x^2/2}\,dx.$$


A particular improper integral that is important in probability theory is defined by

$$F(\omega) := \int_{-\infty}^{\infty} e^{j\omega x} f(x)\,dx,$$

where $j := \sqrt{-1}$ and $\omega$ is a real parameter, and where we assume that the real-valued
function $f(x)$ is absolutely integrable; that is,

$$\int_{-\infty}^{\infty} |f(x)|\,dx < \infty.$$

The function $F(\omega)$ is called the Fourier transform (up to a constant factor) of the
function $f(x)$. It can be shown that

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-j\omega x} F(\omega)\,d\omega.$$

We say that $f(x)$ is the inverse Fourier transform of $F(\omega)$.

In probability theory, a density function $f_X(x)$ is, by definition, nonnegative and
such that

$$\int_{-\infty}^{\infty} f_X(x)\,dx = 1.$$

Hence, the function $F(\omega)$ is well defined. In this context, it is known as the characteristic
function of the random variable $X$ and is often denoted by $C_X(\omega)$. By differentiating
it, we can obtain the moments of $X$ (generally more easily than by performing the
appropriate integrals).

Example 1.3.4. The characteristic function of a standard normal random variable is

$$C_X(\omega) = e^{-\omega^2/2}.$$

We can show that the expected value of the square of $X$ is given by

$$(-j)^2 \frac{d^2}{d\omega^2} e^{-\omega^2/2}\Big|_{\omega=0} = -e^{-\omega^2/2}(\omega^2 - 1)\Big|_{\omega=0} = 1,$$

which agrees with the result found in the previous example.

Finally, to obtain the moments of a random variable $X$, we can also use its
moment-generating function, which, in the continuous case, is defined (if the integral exists)
by

$$M_X(t) = \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx,$$

where we may assume that $t$ is a real parameter. When $X$ is nonnegative, the
moment-generating function is actually the Laplace transform of the function $f_X(x)$.


1.3.2 Double integrals 

In Chapter 4, on random vectors, we have to deal with double integrals to obtain various
quantities of interest. A joint density function is a nonnegative function $f_{X,Y}(x, y)$ such
that

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx\,dy = 1.$$

Double integrals are needed, in general, to calculate the probability $P[A]$ that the
continuous random vector $(X, Y)$ takes on a value in a given subset $A$ of $\mathbb{R}^2$. We have:

$$P[A] = \iint_A f_{X,Y}(x, y)\,dx\,dy.$$

One has to describe the region $A$ with the help of functions of $x$ or of $y$ (whichever is
easier or more appropriate in the problem considered). That is,

$$P[A] = \int_a^b \int_{g_1(x)}^{g_2(x)} f_{X,Y}(x, y)\,dy\,dx = \int_a^b \left\{ \int_{g_1(x)}^{g_2(x)} f_{X,Y}(x, y)\,dy \right\} dx \tag{1.4}$$

or

$$P[A] = \int_c^d \int_{h_1(y)}^{h_2(y)} f_{X,Y}(x, y)\,dx\,dy = \int_c^d \left\{ \int_{h_1(y)}^{h_2(y)} f_{X,Y}(x, y)\,dx \right\} dy. \tag{1.5}$$

Remarks. (i) If the functions $g_1(x)$ and $g_2(x)$ [or $h_1(y)$ and $h_2(y)$] are constants, and if
the function $f_{X,Y}(x, y)$ can be written as

$$f_{X,Y}(x, y) = f_X(x) f_Y(y),$$

then the double integral giving $P[A]$ can be expressed as a product of single integrals:

$$P[A] = \int_a^b f_X(x)\,dx \int_{\alpha}^{\beta} f_Y(y)\,dy,$$

where $\alpha = g_1(x)\ \forall x$ and $\beta = g_2(x)\ \forall x$.

(ii) Conversely, we can write the product of two single integrals as a double integral:

$$\int_a^b f(x)\,dx \int_c^d g(y)\,dy = \int_a^b \int_c^d f(x) g(y)\,dy\,dx.$$

Example 1.3.5. Consider the function

$$f_{X,Y}(x, y) = \begin{cases} e^{-x-y} & \text{if } x > 0 \text{ and } y > 0, \\ 0 & \text{elsewhere} \end{cases}$$

(see Figure 1.1). To obtain the probability that $(X, Y)$ takes on a value in the region

$$A := \{(x, y) \in \mathbb{R}^2 : x > 0,\ y > 0,\ 0 < 2x + y < 1\}$$

(see Figure 1.2), we calculate the double integral

$$P[A] = \int_0^{1/2} \int_0^{1-2x} e^{-x-y}\,dy\,dx = \int_0^{1/2} \left[ -e^{-x-y} \right]_{y=0}^{y=1-2x} dx
= \int_0^{1/2} \left\{ e^{-x} - e^{x-1} \right\} dx = \left[ -e^{-x} - e^{x-1} \right]_0^{1/2}
= -e^{-1/2} - e^{-1/2} + 1 + e^{-1} \simeq 0.1548.$$

Fig. 1.1. Joint density function in Example 1.3.5.

We may also write that

$$P[A] = \int_0^1 \int_0^{(1-y)/2} e^{-x-y}\,dx\,dy = \int_0^1 \left[ -e^{-x-y} \right]_{x=0}^{x=(1-y)/2} dy
= \int_0^1 \left\{ e^{-y} - e^{-(1+y)/2} \right\} dy = \left[ -e^{-y} + 2e^{-(1+y)/2} \right]_0^1
= -e^{-1} + 2e^{-1} + 1 - 2e^{-1/2} \simeq 0.1548.$$

Remark. In this example, the functions $g_i(x)$ and $h_i(y)$, $i = 1, 2$, that appear in (1.4)
and (1.5) are given by

$$g_1(x) \equiv 0, \quad g_2(x) = 1 - 2x, \quad h_1(y) \equiv 0 \quad \text{and} \quad h_2(y) = (1 - y)/2.$$

If $B$ is the rectangle defined by

$$B = \{(x, y) \in \mathbb{R}^2 : 0 < x < 1,\ 0 < y < 2\},$$

then we have:

$$P[B] = \int_0^2 \int_0^1 e^{-x-y}\,dx\,dy = \int_0^1 e^{-x}\,dx \int_0^2 e^{-y}\,dy
= \left[ -e^{-x} \right]_0^1 \left[ -e^{-y} \right]_0^2 = (1 - e^{-1})(1 - e^{-2}) \simeq 0.5466.$$
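The probabilities in Example 1.3.5 can be approximated by iterated numerical integration, following the two orders of integration in (1.4) and (1.5). The sketch below uses a nested midpoint rule; the rule, the default of 200 cells per direction and the function names are assumptions made for the illustration.

```python
import math


def joint_density(x, y):
    """Joint density of Example 1.3.5: exp(-x - y) on the positive quadrant."""
    return math.exp(-x - y) if x > 0.0 and y > 0.0 else 0.0


def midpoint_rule(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b] with n cells."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))


def prob_A_dy_dx(n=200):
    """P[A] as in (1.4): y from 0 to 1 - 2x, then x from 0 to 1/2."""
    def inner(x):
        return midpoint_rule(lambda y: joint_density(x, y), 0.0, 1.0 - 2.0 * x, n)
    return midpoint_rule(inner, 0.0, 0.5, n)


def prob_A_dx_dy(n=200):
    """P[A] as in (1.5): x from 0 to (1 - y)/2, then y from 0 to 1."""
    def inner(y):
        return midpoint_rule(lambda x: joint_density(x, y), 0.0, (1.0 - y) / 2.0, n)
    return midpoint_rule(inner, 0.0, 1.0, n)


def prob_B(n=200):
    """P[B] over the rectangle 0 < x < 1, 0 < y < 2, as a product of single integrals."""
    px = midpoint_rule(lambda x: math.exp(-x), 0.0, 1.0, n)
    py = midpoint_rule(lambda y: math.exp(-y), 0.0, 2.0, n)
    return px * py


if __name__ == "__main__":
    print(prob_A_dy_dx(), prob_A_dx_dy(), prob_B())
```

Both orders of integration should give approximately 0.1548 for $P[A]$, and the product of single integrals approximately 0.5466 for $P[B]$.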


1.4 Infinite series 

Let $a_1, a_2, \ldots, a_n, \ldots$ be an infinite sequence of real numbers, where $a_n$ is given by a
certain formula or rule, for example,

a n = for n = 1 , 2 ,... . 

ra+l 

We denote the infinite sequence by $\{a_n\}_{n=1}^{\infty}$ or simply by $\{a_n\}$. An infinite sequence is
said to be convergent if $\lim_{n \to \infty} a_n$ exists; otherwise, it is divergent.

Next, from the sequence $\{a_n\}_{n=1}^{\infty}$ we define a new infinite sequence by

$$S_1 = a_1, \quad S_2 = a_1 + a_2, \quad \ldots, \quad S_n = a_1 + a_2 + \cdots + a_n, \quad \ldots.$$

Fig. 1.2. Integration region in Example 1.3.5.

Definition 1.4.1. The infinite sequence $S_1, S_2, \ldots, S_n, \ldots$ is represented by $\sum_{n=1}^{\infty} a_n$
and is called an infinite series. Moreover, $S_n := \sum_{k=1}^{n} a_k$ is called the $n$th partial
sum of the series. Finally, if the limit $\lim_{n \to \infty} S_n$ exists (resp., does not exist), we say
that the series is convergent (resp., divergent).

In probability, the set of possible values of a discrete random variable $X$ may be
finite or countably infinite (see p. 27). In the latter case, the probability function $p_X$ of
$X$ is such that

$$\sum_{k=1}^{\infty} p_X(x_k) = 1,$$

where $x_1, x_2, \ldots$ are the possible values of $X$. In the most important cases, the possible
values of $X$ are actually the integers $0, 1, \ldots$.

1.4.1 Geometric series 

A particular type of infinite series encountered in Chapter 3 is known as a geometric
series. These series are of the form

$$\sum_{n=1}^{\infty} a r^{n-1} \quad \text{or} \quad \sum_{n=0}^{\infty} a r^{n}, \tag{1.6}$$

where $a$ and $r$ are real constants.

Proposition 1.4.1. If $|r| < 1$, the geometric series $S(a, r) := \sum_{n=0}^{\infty} a r^n$ converges to
$a/(1 - r)$. If $|r| \geq 1$ (and $a \neq 0$), then the series is divergent.

To prove the above results, we simply have to consider the $n$th partial sum of the
series:

$$S_n = \sum_{k=1}^{n} a r^{k-1} = a + ar + ar^2 + \cdots + ar^{n-1}.$$

We have:

$$r S_n = ar + ar^2 + ar^3 + \cdots + ar^{n-1} + ar^n,$$

so that

$$S_n - r S_n = a - ar^n \overset{r \neq 1}{\implies} S_n = \frac{a(1 - r^n)}{1 - r}.$$

Hence, we deduce that

$$S := \lim_{n \to \infty} S_n = \begin{cases} \dfrac{a}{1 - r} & \text{if } |r| < 1, \\ \text{does not exist} & \text{if } |r| > 1. \end{cases}$$

If $r = 1$, we have that $S_n = na$, so that the series diverges (if $a \neq 0$). Finally, we can
show that the series is also divergent if $r = -1$.
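The behavior of the partial sums described in the proof can be seen in a small computation. In this sketch the function names and the sample values of $a$ and $r$ are illustrative assumptions.

```python
def partial_sum(a, r, n):
    """n-th partial sum S_n = a + a r + ... + a r^(n-1), computed term by term."""
    total, term = 0.0, a
    for _ in range(n):
        total += term
        term *= r
    return total


def closed_form_partial_sum(a, r, n):
    """Closed form S_n = a (1 - r^n) / (1 - r), valid for r != 1."""
    return a * (1.0 - r ** n) / (1.0 - r)


if __name__ == "__main__":
    # |r| < 1: the partial sums approach a / (1 - r) = 6.
    print([partial_sum(3.0, 0.5, n) for n in (1, 5, 20, 60)])
    # r = -1: the partial sums oscillate between a and 0, so there is no limit.
    print([partial_sum(1.0, -1.0, n) for n in range(1, 7)])
    # r = 1: S_n = n a grows without bound.
    print([partial_sum(1.0, 1.0, n) for n in (1, 10, 100)])
```

For $r = -1$ the partial sums alternate between $a$ and 0, which is one way to see the divergence mentioned at the end of the proof.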
Other useful formulas connected with geometric series are the following: if $|r| < 1$,
then

$$\sum_{k=1}^{\infty} a r^k = \frac{ar}{1 - r} \quad \text{and} \quad \sum_{k=0}^{\infty} a k r^k = \frac{ar}{(1 - r)^2}. \tag{1.7}$$

In probability, we also use power series, that is, series of the form

$$S(x) := a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots,$$

where $a_k$ is a constant, for $k = 0, 1, \ldots$. In particular, we use the fact that it is possible
to express functions, for instance, the exponential function $e^{cx}$, as a power series:

$$e^{cx} = 1 + cx + \frac{c^2}{2!} x^2 + \cdots + \frac{c^n}{n!} x^n + \cdots \quad \text{for all } x \in \mathbb{R}. \tag{1.8}$$

This power series is called the series expansion of $e^{cx}$.

Remark. Note that a geometric series $S(a, r)$ is a power series $S(r)$ for which all the
constants $a_k$ are equal to $a$.

In general, a power series converges for all values of $x$ in an interval around 0. If
a given series expansion is valid for $|x| < R$ ($> 0$), we say that $R$ is the radius of
convergence of the series. For $|x| < R$, the series can be differentiated and integrated
term by term:

$$S'(x) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1} + \cdots \tag{1.9}$$

and

$$\int_0^x S(t)\,dt = a_0 x + a_1 \frac{x^2}{2} + \cdots + a_n \frac{x^{n+1}}{n + 1} + \cdots.$$

The interval of convergence of a power series having a radius of convergence $R > 0$ is
at least $(-R, R)$. The series may or may not converge for $x = -R$ and $x = R$. Because

$$S(0) = a_0 \in \mathbb{R},$$

any power series converges for $x = 0$. If the series does not converge for any $x \neq 0$, we
write that $R = 0$. Conversely, if the series converges for all $x \in \mathbb{R}$, then $R = \infty$.

To calculate the radius of convergence of a power series, we can make use of
d'Alembert's ratio test: suppose that the limit

$$L := \lim_{n \to \infty} \left| \frac{u_{n+1}}{u_n} \right| \tag{1.10}$$

exists. Then, the series $\sum_n u_n$

(a) converges absolutely if $L < 1$;

(b) diverges if $L > 1$.

Remarks. (i) If the limit $L$ in (1.10) does not exist, or if $L = 1$, then the test is
inconclusive. There exist other criteria that can be used, for instance, Raabe's test.

(ii) An infinite series $\sum_n u_n$ converges absolutely if $\sum_n |u_n|$ converges. A series that
converges absolutely is also convergent.

(iii) In the case of a power series, we calculate

$$\lim_{n \to \infty} \left| \frac{a_{n+1} x^{n+1}}{a_n x^n} \right| = \lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| |x|.$$

For example, the series expansion of the exponential function $e^{cx}$ given in (1.8) is valid
for all $x \in \mathbb{R}$. Indeed, $a_n = c^n/n!$, so that

$$\lim_{n \to \infty} \left| \frac{a_{n+1} x^{n+1}}{a_n x^n} \right| = \lim_{n \to \infty} \frac{|c|}{n + 1} |x| = 0 < 1 \quad \forall x \in \mathbb{R}.$$
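The ratio computed above for the exponential series, and the convergence of the expansion (1.8), can be illustrated numerically. The function names and the sample values $c = 2$, $x = 3$ in this sketch are assumptions made for the illustration.

```python
import math


def exp_series(c, x, terms):
    """Partial sum of (1.8): sum of (c x)^n / n! for n = 0, ..., terms - 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= c * x / (n + 1)  # ratio of consecutive terms: c x / (n + 1)
    return total


def term_ratio(c, x, n):
    """|a_{n+1} x^{n+1} / (a_n x^n)| = |c| |x| / (n + 1) for a_n = c^n / n!."""
    return abs(c * x) / (n + 1)


if __name__ == "__main__":
    c, x = 2.0, 3.0
    print([term_ratio(c, x, n) for n in (1, 10, 100, 1000)])
    print(exp_series(c, x, 80), math.exp(c * x))
```

The ratios shrink toward 0 as $n$ grows, whatever the value of $x$, and the partial sums settle on $e^{cx}$.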


Example 1.4.1. To obtain the mean of a geometric random variable, we can compute
the infinite sum

$$\sum_{k=1}^{\infty} k (1 - p)^{k-1} p = \frac{p}{1 - p} \sum_{k=0}^{\infty} k (1 - p)^k \overset{(1.7)}{=} \frac{p}{1 - p} \cdot \frac{1 - p}{p^2} = \frac{1}{p},$$

where $0 < p < 1$. To prove Formula (1.7), we can use (1.9).

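Example 1.4.2. A Poisson random variable is such that its probability function is given
by

$$p_X(x) = e^{-\lambda} \frac{\lambda^x}{x!} \quad \text{for } x = 0, 1, \ldots,$$

where $\lambda$ is a positive constant. We have:

$$\sum_{x=0}^{\infty} p_X(x) = \sum_{x=0}^{\infty} e^{-\lambda} \frac{\lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} \overset{(1.8)}{=} e^{-\lambda} e^{\lambda} = 1,$$

as required.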
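The sums in Examples 1.4.1 and 1.4.2 can be checked by truncating the series. The sketch below is only illustrative: the function names, the truncation points and the sample values of $p$ and $\lambda$ are assumptions.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability function: exp(-lam) lam^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def geometric_mean_partial(p, terms):
    """Truncated version of the sum of k (1 - p)^(k-1) p over k = 1, 2, ..."""
    return sum(k * (1.0 - p) ** (k - 1) * p for k in range(1, terms + 1))


if __name__ == "__main__":
    lam = 2.5
    print(sum(poisson_pmf(x, lam) for x in range(100)))  # close to 1
    p = 0.25
    print(geometric_mean_partial(p, 500), 1.0 / p)       # both close to 4
```

The truncated Poisson probabilities add up to 1 within rounding error, and the truncated sum for the geometric mean agrees with $1/p$.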

Example 1.4.3. The power series

$$S_k(x) := 1 + kx + \frac{k(k-1)}{2!} x^2 + \cdots + \frac{k(k-1)(k-2) \cdots (k-n+1)}{n!} x^n + \cdots$$

is the series expansion of the function $(1 + x)^k$, and is called the binomial series. In
probability, $k$ will be a natural number. It follows that the series actually has a finite
number of terms and thus converges for all $x \in \mathbb{R}$. Moreover, we can then write that

$$\frac{k(k-1)(k-2) \cdots (k-n+1)}{n!} = \frac{k!}{(k-n)!\, n!}$$

for $n = 1, \ldots, k$.

To conclude this review of calculus, we give the main logarithmic formulas:

$$\ln ab = \ln a + \ln b; \quad \ln a/b = \ln a - \ln b; \quad \ln a^b = b \ln a.$$

We also have:

$$\ln e^{cx} = e^{\ln(cx)} = cx$$

and

$$e^{f(x) \ln x} = x^{f(x)}.$$


1.5 Exercises for Chapter 1 


Solved exercises 1 


Question no. 1 

Calculate $\lim_{x \to 0} x \sin(1/x)$.

Question no. 2 

For what values of $x$ is the function

$$f(x) = \begin{cases} \dfrac{\sin x}{x} & \text{if } x \neq 0, \\ 1 & \text{if } x = 0 \end{cases}$$

continuous?

Question no. 3 

Differentiate the function $f(x) = \sqrt{3x + 1}\,(2x^2 + 1)^2$.

Question no. 4 

Find the limit $\lim_{x \downarrow 0} x \ln x$.

Question no. 5 

Evaluate the definite integral 


h'= 



Inx 

- ax. 

x 


1 The solutions can be found in Appendix C. 
Question no. 6

Find the value of the definite integral

$$\int_{-\infty}^{\infty} x^3 e^{-x^2/2}\,dx.$$


Question no. 7 

Find the Fourier transform of the function 

$$f(x) = c e^{-cx} \quad \text{for } x > 0,$$


where c is a positive constant. 


Question no. 8 

Let

$$f(x,y) = \begin{cases} x + y & \text{if } 0 < x < 1,\ 0 < y < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

Calculate

$$I_8 := \iint_A f(x,y)\,dx\,dy,$$

where $A := \{(x,y) \in \mathbb{R}^2 : 0 < x < 1,\ 0 < y < 1,\ x^2 < y\}$ (see Figure 1.3).



Fig. 1.3. Region A in solved exercise no. 8. 


Question no. 9 

Find the value of the infinite series 


$$\frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \cdots$$

Question no. 10

Calculate 

oo 

Sio '■= 

k =1 

where 0 < p < 1. 


Exercises 


Question no. 1 

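Let

$$F(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1/2 & \text{if } x = 0, \\ 1 - (1/2)e^{-x} & \text{if } x > 0. \end{cases}$$

Calculate (a) $\lim_{x \downarrow 0} F(x)$, (b) $\lim_{x \uparrow 0} F(x)$, and (c) $\lim_{x \to 0} F(x)$.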

Question no. 2 

Consider the function

$$F(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1/3 & \text{if } 0 < x < 1, \\ 2/3 & \text{if } 1 < x < 2, \\ 1 & \text{if } x > 2. \end{cases}$$

For what values of $x$ is $F(x)$ left-continuous? right-continuous? continuous?

Question no. 3 

Find the limit as $x$ tends to 0 of the function

$$f(x) = \frac{x^2 \sin(1/x)}{\sin x} \quad \text{for } x \in \mathbb{R}.$$


Question no. 4 

Is the function

$$f(x) = \begin{cases} e^{1/x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0 \end{cases}$$

continuous or discontinuous at $x = 0$? Justify.


Question no. 5 

Find the fourth-order derivative of the function $F(\omega) = e^{-\omega^2/2}$, for any $\omega \in \mathbb{R}$, and
evaluate the derivative at $\omega = 0$.






Question no. 6 

Find the following limit: 

(l + x — %) e ~ x +( e / 2 ) — (l + x + §) 
lim --—---—- 

efO e 


Question no. 7 

Determine the second-order derivative of f(x) = \j2x 2 . 

Question no. 8 

Calculate the derivative of 

f(x) = 1 + ^ for x G M 

and find the value of x that minimizes this function. 

Question no. 9 

Use the fact that 


$$\int_0^{\infty} x^{n-1} e^{-x}\,dx = (n-1)! \quad \text{for } n = 1, 2, \ldots$$


to evaluate the integral 



where c is a positive constant. 


Question no. 10 

Use the following formula: 



$$\int_{-\infty}^{\infty} e^{-(x-m)^2/2}\,dx = \sqrt{2\pi},$$


where m is a real constant, to calculate the definite integral 


f 


x 2 e x /2 dx. 


Question no. 11 

Evaluate the improper integral 


L 


oo 



X 
Question no. 12

Find a primitive of the function

$$f(x) = e^x \sin x \quad \text{for } x \in \mathbb{R}.$$


Question no. 13 

Find the Fourier transform of the function 


$$f(x) = \frac{c}{2} e^{-c|x|} \quad \text{for } x \in \mathbb{R},$$


where c is a positive constant. 

Question no. 14 


Let 



\x if 1 < x < 2,1 < y < 2, x < y 


0 elsewhere. 


Calculate the double definite integral 


where 


// f(x, y ) dxdy, 

A 

A := {(x, y) G R 2 : 1 < x < 2,1 < y < 2, x 2 < y}. 


Question no. 15 

The convolution of two functions, $f$ and $g$, is denoted by $f * g$ and is defined by



Let 



where $c$ is a positive constant, and assume that $g(x) = f(x)$ (i.e., $g$ is identical to $f$).
Find $(f * g)(x)$.


Remark. In probability theory, the convolution of $f_X$ and $f_Y$ is the probability density
function of the sum $Z := X + Y$ of the independent random variables $X$ and $Y$. The result
of the above exercise implies that the sum of two independent exponentially distributed
random variables with the same parameter $c$ has a gamma distribution with parameters
$\alpha = 2$ and $\lambda = c$.


Question no. 16 

Prove that

$$I := \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\,dz = 1$$

by first writing that

$$I^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-z^2/2}\,dz \int_{-\infty}^{\infty} e^{-w^2/2}\,dw = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(z^2+w^2)/2}\,dz\,dw$$

and then using polar coordinates. That is, set $z = r \cos\theta$ and $w = r \sin\theta$ (with $r > 0$),
so that

$$r = \sqrt{z^2 + w^2} \quad \text{and} \quad \theta = \tan^{-1}(w/z).$$

Remark. We find that $I^2 = 1$. Justify why this implies that $I = 1$ (and not $I = -1$).

Question no. 17 

Determine the value of the infinite series 

$$\frac{1}{2!} x^2 - \frac{1}{3!} x^3 + \cdots + \frac{(-1)^n}{n!} x^n + \cdots.$$


Question no. 18 

Let

$$S(q) = \sum_{n=1}^{\infty} q^{n-1},$$

where $0 < q < 1$. Calculate



S(q)dq. 


Question no. 19 

(a) Calculate the infinite series

$$M(t) := \sum_{k=0}^{\infty} e^{tk} e^{-\alpha} \frac{\alpha^k}{k!},$$

where $\alpha > 0$.

(b) Evaluate the second-order derivative $M''(t)$ at $t = 0$.

Remark. The function $M(t)$ is the moment-generating function of a random variable $X$
having a Poisson distribution with parameter $\alpha$. Moreover, $M''(0)$ gives us the expected
value of $X^2$.
Question no. 20

(a) Determine the value of the power series

$$G(z) := \sum_{k=0}^{\infty} z^k (1-p)^k p,$$

where $z \in \mathbb{R}$ and $p \in (0,1)$ are such that $|z| < (1-p)^{-1}$.

(b) Calculate

$$\frac{d^k}{dz^k} G(z) \quad \text{for } k = 0, 1, \ldots$$

at $z = 0$.

Remark. The function $G_X(z) := \sum_{k=0}^{\infty} z^k p_X(k)$ is called the generating function of the
discrete random variable $X$ taking its values in the set $\{0, 1, \ldots\}$ and having probability
mass function $p_X(k)$. Furthermore,

$$\left. \frac{1}{k!} \frac{d^k}{dz^k} G_X(z) \right|_{z=0}$$

yields $p_X(k)$.


Multiple choice questions 


Question no. 1 

Calculate the limit 


(a) —1 (b) —1/2 (c) 1/2 (d) 1 (e) does not exist 

Question no. 2 

Let $u(x)$ be the Heaviside function (see p. 2) and define $h(x) = e^{u(x)/2}$, for $x \in \mathbb{R}$.
Calculate the limit of the function $h(x)$ as $x$ tends to 0.

(a) 1/2 (b) 1 (c) $(1 + e^{1/2})/2$ (d) $e^{1/2}$ (e) does not exist

Question no. 3 

Evaluate the second-order derivative of the function 

FM = 2 ~B$y for “ €R 

at $\omega = 0$.

(a) -1/4 (b) -1/2 (c) 0 (d) 1/4 (e) 1/2


lim 


y/x- 1 
Question no. 4

Find the following limit: 



lim 1 H— 


x 


Indication. Take the natural logarithm of the expression first and then use the fact that
$\ln x$ is a continuous function.

(a) 0 (b) 1 (c) $e$ (d) $\infty$ (e) does not exist

Question no. 5 

Use the formula 



where a and b (> 0) are constants, to evaluate the definite integral 



(a) 1/4 (b) 1/2 (c) 1 (d) 2 (e) 4 

Question no. 6 

Calculate the definite integral 



(a) —oo (b) —1 (c) —1/2 (d) —1/4 (e) 0 

Question no. 7 

Suppose that

$$f(x) = \begin{cases} 1 & \text{if } 0 < x < 1, \\ 0 & \text{elsewhere} \end{cases}$$

and that $g(x) = f(x)$ for all $x \in \mathbb{R}$. Find $(f * g)(3/2)$, where $*$ denotes the convolution
of $f$ and $g$ (see no. 15, p. 22).

(a) 0 (b) 1/2 (c) 1 (d) 3/2 (e) 2 

Question no. 8 


Let

$$f(x,y) = \begin{cases} 2 - x - y & \text{if } 0 < x < 1,\ 0 < y < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

Calculate the following double integral:

$$\iint_A f(x,y)\,dx\,dy,$$

where

$$A := \{(x,y) \in \mathbb{R}^2 : 0 < x < 1,\ 0 < y < 1,\ x + y > 1\}.$$

(a) 1/6 (b) 1/4 (c) 1/3 (d) 1/2 (e) 5/6 

Question no. 9 

Find the value of the natural logarithm of the infinite product

$$\prod_{n=1}^{\infty} e^{1/2^n} = e^{1/2} \times e^{1/4} \times e^{1/8} \times \cdots.$$


(a) -1 (b) -1/2 (c) 0 (d) 1/2 (e) 1 

Question no. 10 

We define $a_0 = 0$ and


a k = e 


'm 


for $k = \ldots, -2, -1, 1, 2, \ldots$,

where $c > 0$ is a constant. Find the value of the infinite series $\sum_{k=-\infty}^{\infty} a_k$.

(a) $1 - e^{-c}$ (b) 0 (c) 1 (d) $(1/2)e^{-c}$ (e) $1 + e^{-c}$
Elementary probability


This first chapter devoted to probability theory contains the basic definitions and
concepts in this field, without the formalism of measure theory. However, the range of
problems that can be solved by using the formulas of elementary probability is very 
broad, particularly in combinatorial analysis. Therefore, it is necessary to do numerous 
exercises in order to master these basic concepts. 

2.1 Random experiments 

A random experiment is an experiment that, at least theoretically, may be repeated as 
often as we want and whose outcome cannot be predicted, for example, the roll of a 
die. Each time the experiment is repeated, an elementary outcome is obtained. The set 
of all elementary outcomes of a random experiment is called the sample space , which is 
denoted by $\Omega$.

Sample spaces may be discrete or continuous. 

(a) Discrete sample spaces. (i) Firstly, a sample space is discrete if the number of
possible outcomes is finite. For example, if a die is rolled and the number that shows
up is noted, then $\Omega = \{1, 2, \ldots, 6\}$.

(ii) Secondly, a sample space is discrete if the number of possible outcomes is countably
infinite, which means that there is an infinite number of possible outcomes, but these
outcomes can be put in a one-to-one correspondence with the positive integers. For
example, if a die is rolled until a “6” is obtained, and the number of rolls made before
getting this first “6” is counted, then we have that $\Omega = \{0, 1, 2, \ldots\}$. This set is
equivalent to the set of all natural integers $\{1, 2, \ldots\}$, because we can associate the
natural number $k + 1$ with each element $k = 0, 1, \ldots$ of $\Omega$.

(b) Continuous sample spaces. If the sample space contains one or many intervals,
the sample space is then uncountably infinite. For example, a die is rolled until a “6”
is obtained and the time needed to get this first “6” is recorded. In this case, we have
that $\Omega = \{t \in \mathbb{R} : t > 0\}$ [or $\Omega = (0, \infty)$].
2.2 Events

Definition 2.2.1. An event is a set of elementary outcomes. That is, it is a subset of
the sample space $\Omega$. In particular, every elementary outcome is an event, and so is the
sample space itself.

Remarks. (i) An elementary outcome is sometimes called a simple event, whereas a
compound event is made up of at least two elementary outcomes.

(ii) To be precise, we should distinguish between the elementary outcome $\omega$, which is
an element of $\Omega$, and the elementary event $\{\omega\} \subset \Omega$.

(iii) The events are denoted by $A$, $B$, $C$, and so on.

Definition 2.2.2. Two events, $A$ and $B$, are said to be incompatible (or mutually
exclusive) if their intersection is empty. We then write that $A \cap B = \emptyset$.

Example 2.2.1. Consider the experiment that consists in rolling a die and recording
the number that shows up. We have that $\Omega = \{1, 2, 3, 4, 5, 6\}$. We define the events

$$A = \{1, 2, 4\}, \quad B = \{2, 4, 6\} \quad \text{and} \quad C = \{3, 5\}.$$

We have:

$$A \cup B = \{1, 2, 4, 6\}, \quad A \cap B = \{2, 4\} \quad \text{and} \quad A \cap C = \emptyset.$$

Therefore, $A$ and $C$ are incompatible events. Moreover, we may write that $A' = \{3, 5, 6\}$,
where the symbol $'$ denotes the complement of the event.
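The operations in Example 2.2.1 map directly onto set operations in a programming language. The following Python sketch reproduces them; the variable names are chosen here for illustration only.

```python
# Sample space and events of Example 2.2.1, represented as Python sets.
omega = {1, 2, 3, 4, 5, 6}
A, B, C = {1, 2, 4}, {2, 4, 6}, {3, 5}

union_AB = A | B          # union of A and B
inter_AB = A & B          # intersection of A and B
inter_AC = A & C          # empty set: A and C are incompatible
complement_A = omega - A  # complement of A with respect to omega

print(union_AB, inter_AB, inter_AC, complement_A)
```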

To represent a sample space and some events, we often use a Venn diagram as in
Figure 2.1. In general, for three events we have the diagram in Figure 2.2.


Fig. 2.1. Venn diagram for Example 2.2.1.


Fig. 2.2. Venn diagram for three arbitrary events.


2.3 Probability 

Definition 2.3.1. The probability of an event $A \subset \Omega$, denoted by $P[A]$, is a real
number obtained by applying to $A$ the function $P$ that possesses the following properties:


(i) $0 \leq P[A] \leq 1$;

(ii) if $A = \Omega$, then $P[A] = 1$;

(iii) if $A = A_1 \cup A_2 \cup \cdots \cup A_n$, where $A_1, \ldots, A_n$ are incompatible events, then we may
write that

$$P[A] = \sum_{i=1}^{n} P[A_i] \quad \text{for } n = 2, 3, \ldots, \infty.$$

Remarks. (i) Actually, we only have to write that $P[A] \geq 0$ in the definition, because
we can show that

$$P[A] + P[A'] = 1,$$

which implies that $P[A] = 1 - P[A'] \leq 1$.

(ii) We also have the following results:

$$P[\emptyset] = 0 \quad \text{and} \quad P[A] \leq P[B] \ \text{ if } A \subset B.$$


(iii) The definition of the probability of an event is motivated by the notion of relative 
frequency. For example, suppose that the random experiment that consists in rolling a 
die is repeated a very large number of times, and that we wish to obtain the probability 
of any of the possible outcomes of this experiment, namely, the integers 1, 2,..., 6. The 
relative frequency of the elementary event $\{k\}$ is the quantity $f_{\{k\}}(n)$ defined by

$$f_{\{k\}}(n) = \frac{N_{\{k\}}(n)}{n},$$

where $N_{\{k\}}(n)$ is the number of times that the possible outcome $k$ occurred among the
$n$ rolls of the die. We can write that

$$0 \leq f_{\{k\}}(n) \leq 1 \quad \text{for } k = 1, 2, \ldots, 6$$

and that

$$\sum_{k=1}^{6} f_{\{k\}}(n) = 1.$$

Indeed, we obviously have that $N_{\{k\}}(n) \in \{0, 1, \ldots, n\}$, so that $f_{\{k\}}(n)$ belongs to $[0, 1]$,
and

$$\sum_{k=1}^{6} f_{\{k\}}(n) = \frac{N_{\{1\}}(n) + \cdots + N_{\{6\}}(n)}{n} = \frac{n}{n} = 1.$$

Furthermore, if $A$ is an event containing two possible outcomes, for instance “1” and
“2,” then

$$f_A(n) = f_{\{1\}}(n) + f_{\{2\}}(n),$$

because the outcomes 1 and 2 cannot occur on the same roll of the die.

Finally, the probability of the elementary event $\{k\}$ can theoretically be obtained
by taking the limit of $f_{\{k\}}(n)$ as the number $n$ of rolls tends to infinity:

$$P[\{k\}] = \lim_{n \to \infty} f_{\{k\}}(n).$$

The probability of an arbitrary event can be expressed in terms of the relative frequency 
of this event, thus it is logical that the properties of probabilities more or less mimic 
those of relative frequencies. 
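The notion of relative frequency can be illustrated by simulation. The sketch below rolls a simulated fair die n times and computes the relative frequency of each outcome; the function name `relative_frequencies`, the seed, and the number of rolls are assumptions made for illustration.

```python
import random
from collections import Counter

def relative_frequencies(n, seed=0):
    """Relative frequency of each outcome k = 1, ..., 6 among n simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    counts = Counter(rng.randint(1, 6) for _ in range(n))
    return {k: counts[k] / n for k in range(1, 7)}

freqs = relative_frequencies(60_000)
print(freqs)                # each value should be close to 1/6
print(sum(freqs.values()))  # the relative frequencies add up to 1
```

With a large number of rolls, each relative frequency settles near 1/6, in agreement with the limit written above for a fair die.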

Sometimes, the probability of an elementary outcome is simply equal to 1 divided 
by the total number of elementary outcomes. In this case, the elementary outcomes are 
said to be equiprobable (or equally likely). For example, if a fair (or unbiased ) die is 
rolled, then we have that $P[\{1\}] = P[\{2\}] = \cdots = P[\{6\}] = 1/6$.

If the elementary outcomes $r_i$ are not equiprobable, we can (try to) make use of the
following formula:

$$P[A] = \sum_{r_i \in A} P[\{r_i\}].$$

However, this formula is only useful if we know the probability of all the elementary
outcomes $r_i$ that constitute the event $A$.

Now, if $A$ and $B$ are incompatible events, then we deduce from the third property
of $P[\,\cdot\,]$ that $P[A \cup B] = P[A] + P[B]$. If $A$ and $B$ are not incompatible, we can show
(see Figure 2.3) that

$$P[A \cup B] = P[A] + P[B] - P[A \cap B].$$


Fig. 2.3. Probability of the union of two arbitrary events.


Similarly, in the case of three arbitrary events, we have:

$$P[A \cup B \cup C] = P[A] + P[B] + P[C] - P[A \cap B] - P[A \cap C] - P[B \cap C] + P[A \cap B \cap C].$$

Example 2.3.1. The three most popular options for a certain model of new car are A: 
automatic transmission, B: V6 engine, and C: air conditioning. Based on the previous 
sales data, we may suppose that $P[A] = 0.70$, $P[B] = 0.75$, $P[C] = 0.80$, $P[A \cup B] = 0.80$,
$P[A \cup C] = 0.85$, $P[B \cup C] = 0.90$, and $P[A \cup B \cup C] = 0.95$, where $P[A]$
denotes the probability that an arbitrary buyer chooses option A, and so on. Calculate
the probability of each of the following events: 

(a) the buyer chooses at least one of the three options; 

(b) the buyer does not choose any of the three options; 

(c) the buyer chooses only air conditioning; 

(d) the buyer chooses exactly one of the three options. 

Solution. (a) We seek $P[A \cup B \cup C] = 0.95$ (by assumption).

(b) We now seek $P[A' \cap B' \cap C'] = 1 - P[A \cup B \cup C] = 1 - 0.95 = 0.05$.

(c) The event whose probability is requested is $A' \cap B' \cap C$. We can write that

$$P[A' \cap B' \cap C] = P[A \cup B \cup C] - P[A \cup B] = 0.95 - 0.8 = 0.15.$$

(d) Finally, we want to calculate

$$\begin{aligned} P[(A \cap B' \cap C') &\cup (A' \cap B \cap C') \cup (A' \cap B' \cap C)] \\ &\overset{\text{inc.}}{=} P[A \cap B' \cap C'] + P[A' \cap B \cap C'] + P[A' \cap B' \cap C] \\ &= 3P[A \cup B \cup C] - P[A \cup B] - P[A \cup C] - P[B \cup C] \\ &= 3(0.95) - 0.8 - 0.85 - 0.9 = 0.3. \end{aligned}$$

Remarks. (i) The indication “inc.” above the “=” sign means that the equality is true
because of the incompatibility of the events. We use this type of notation often in this
book to justify the passage from an expression to another.
(ii) The probability of each of the eight elementary outcomes is indicated in the diagram
of Figure 2.4. First, we calculate

$$P[A \cap B] = P[A] + P[B] - P[A \cup B] = 0.7 + 0.75 - 0.8 = 0.65.$$

Likewise, we have:

$$\begin{aligned} P[A \cap C] &= 0.7 + 0.8 - 0.85 = 0.65, \\ P[B \cap C] &= 0.75 + 0.8 - 0.9 = 0.65, \\ P[A \cap B \cap C] &= P[A \cup B \cup C] - P[A] - P[B] - P[C] \\ &\quad + P[A \cap B] + P[A \cap C] + P[B \cap C] \\ &= 0.95 - 0.7 - 0.75 - 0.8 + 3(0.65) = 0.65. \end{aligned}$$


Fig. 2.4. Venn diagram for Example 2.3.1.
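The arithmetic of Example 2.3.1 can be reproduced with a few lines of Python. This is a sketch only; the dictionary layout and the variable names are assumptions made here, while the numerical values are those of the example.

```python
# Probabilities given in Example 2.3.1.
p = {"A": 0.70, "B": 0.75, "C": 0.80}
p_union = {"AB": 0.80, "AC": 0.85, "BC": 0.90, "ABC": 0.95}

p_none = 1 - p_union["ABC"]                  # part (b)
p_only_C = p_union["ABC"] - p_union["AB"]    # part (c)
p_exactly_one = (3 * p_union["ABC"] - p_union["AB"]
                 - p_union["AC"] - p_union["BC"])  # part (d)

# Pairwise intersections from P[X n Y] = P[X] + P[Y] - P[X u Y].
p_AB = p["A"] + p["B"] - p_union["AB"]
p_AC = p["A"] + p["C"] - p_union["AC"]
p_BC = p["B"] + p["C"] - p_union["BC"]

# Triple intersection from the formula for the union of three events.
p_ABC = p_union["ABC"] - sum(p.values()) + p_AB + p_AC + p_BC

print(p_none, p_only_C, p_exactly_one, p_ABC)
```

Because floating-point numbers are used, the printed values may differ from 0.05, 0.15, 0.3, and 0.65 in the last decimal places.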


2.4 Conditional probability 


Definition 2.4.1. The conditional probability of event $A$, given that event $B$
occurred, is defined (and denoted) by (see Figure 2.5)

$$P[A \mid B] = \frac{P[A \cap B]}{P[B]} \quad \text{if } P[B] > 0. \tag{2.1}$$

From the above definition, we obtain the multiplication rule:

$$P[A \cap B] = P[A \mid B]P[B] \quad \text{if } P[B] > 0 \tag{2.2}$$

and

$$P[A \cap B] = P[B \mid A]P[A] \quad \text{if } P[A] > 0.$$
Fig. 2.5. Notion of conditional probability.


Definition 2.4.2. Let $A$ and $B$ be two events such that $P[A]P[B] > 0$. We say that $A$
and $B$ are independent events if

$$P[A \mid B] = P[A] \quad \text{or} \quad P[B \mid A] = P[B]. \tag{2.3}$$

We deduce from the multiplication rule that $A$ and $B$ are independent if and only
if (iff)

$$P[A \cap B] = P[A]P[B]. \tag{2.4}$$

Actually, this equation is the definition of independence of events $A$ and $B$ in the
general case when we can have that $P[A]P[B] = 0$. However, Definition 2.4.2 is more
intuitive, whereas the general definition of independence given by Formula (2.4) is purely
mathematical.

In general, the events $A_1, A_2, \ldots, A_n$ are independent iff

$$P[A_{i_1} \cap \cdots \cap A_{i_k}] = \prod_{j=1}^{k} P[A_{i_j}]$$

for $k = 2, 3, \ldots, n$, where $A_{i_l} \neq A_{i_m}$ if $l \neq m$.

Remark. If $A$ and $B$ are two incompatible events, then they cannot be independent,
unless $P[A]P[B] = 0$. Indeed, in the case when $P[A]P[B] > 0$, we have:

$$P[A \mid B] = \frac{P[A \cap B]}{P[B]} = \frac{P[\emptyset]}{P[B]} = \frac{0}{P[B]} = 0 \neq P[A].$$
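Formula (2.4) and the preceding remark can be checked by enumeration in a small equiprobable sample space. In the sketch below, the fair die, the helper functions `prob` and `independent`, and the events E, F, and G are hypothetical choices made for illustration; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # fair die: equiprobable elementary outcomes

def prob(event):
    """P[event] when all elementary outcomes of omega are equally likely."""
    return Fraction(len(event & omega), len(omega))

def independent(e1, e2):
    """Check the product rule P[e1 n e2] = P[e1] P[e2] of Formula (2.4)."""
    return prob(e1 & e2) == prob(e1) * prob(e2)

E = {1, 2}      # hypothetical event, P[E] = 1/3
F = {2, 4, 6}   # hypothetical event, P[F] = 1/2
G = {3, 5}      # hypothetical event, incompatible with E

print(independent(E, F))  # True: P[E n F] = 1/6 = (1/3)(1/2)
print(independent(E, G))  # False: P[E n G] = 0, but P[E] P[G] = 1/9
```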


Example 2.4.1. A device is constituted of two components, A and B, subject to failures.
The components are connected in parallel (see Figure 2.6) and are not independent. We
estimate the probability of a failure of component A to be 0.2 and that of a failure of
component B to be 0.8 if component A is down, and to be 0.4 if component A is not down.


(a) Calculate the probability of a failure (i) of component B and (ii) of the device.
Fig. 2.6. System for part (a) of Example 2.4.1.


Solution. Let $A$ (resp., $B$) be the event “component A (resp., B) is down.” By
assumption, we have that $P[A] = 0.2$, $P[B \mid A] = 0.8$, and $P[B \mid A'] = 0.4$.

(i) We may write (see Figure 2.7) that

$$P[B] = P[A \cap B] + P[A' \cap B] = P[B \mid A]P[A] + P[B \mid A']P[A'] = (0.8)(0.2) + (0.4)(0.8) = 0.48.$$


Fig. 2.7. Venn diagram for part (a) of Example 2.4.1.


(ii) We seek $P[\text{Device failure}] = P[A \cap B] = P[B \mid A]P[A] = 0.16$.

(b) In order to increase the reliability of the device, a third component, C, is added
to the system in such a way that components A, B, and C are connected in parallel
(see Figure 2.8). The probability that component C fails is equal to 0.2, independently
from the state (up or down) of components A and B. Calculate the probability that the
device made up of components A, B, and C breaks down.


Solution. By assumption, $P[C] = 0.2$ and $C$ is independent of $A$ and $B$. Let $F$ be the
event “the subsystem made up of components A and B fails.” We can write that

$$P[F \cap C] \overset{\text{ind.}}{=} P[A \cap B]P[C] \overset{\text{(a)(ii)}}{=} (0.16)(0.2) = 0.032.$$

Fig. 2.8. System for part (b) of Example 2.4.1.
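The calculations of Example 2.4.1 can be sketched in Python as follows. The variable names are chosen here for illustration; the numerical values are those given in the example.

```python
# Data of Example 2.4.1.
p_A = 0.2               # P[A]: component A is down
p_B_given_A = 0.8       # P[B | A]
p_B_given_not_A = 0.4   # P[B | A']
p_C = 0.2               # P[C], with C independent of A and B

# Part (a): multiplication rule, then sum over the two cases A and A'.
p_A_and_B = p_B_given_A * p_A                    # failure of the two-component device
p_B = p_A_and_B + p_B_given_not_A * (1 - p_A)    # failure of component B

# Part (b): the three-component parallel device fails only if A, B, and C all fail.
p_device_abc = p_A_and_B * p_C

print(p_B, p_A_and_B, p_device_abc)
```

Adding the independent component C divides the failure probability of the device by five, from 0.16 to 0.032.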


2.5 Total probability 

Let $B_1, B_2, \ldots, B_n$ be incompatible and exhaustive events; that is, we have:

$$B_i \cap B_j = \emptyset \ \text{ if } i \neq j \quad \text{and} \quad \bigcup_{i=1}^{n} B_i = \Omega.$$

We say that the events $B_i$ constitute a partition of the sample space $\Omega$. It follows that

$$P\left[\bigcup_{i=1}^{n} B_i\right] = \sum_{i=1}^{n} P[B_i] = P[\Omega] = 1. \tag{2.5}$$

Now, let $A$ be an arbitrary event. We can write that (see Figure 2.9)

$$P[A] = \sum_{i=1}^{n} P[A \cap B_i] = \sum_{i=1}^{n} P[A \mid B_i]P[B_i]$$

(the second equality above being valid when $P[B_i] > 0$, for $i = 1, 2, \ldots, n$).

Remark. This formula is sometimes called the law of total probability.

Finally, suppose that we wish to calculate $P[B_i \mid A]$, for $i = 1, \ldots, n$. We have:

$$P[B_i \mid A] = \frac{P[A \cap B_i]}{P[A]} = \frac{P[A \mid B_i]P[B_i]}{\sum_{j=1}^{n} P[A \cap B_j]} = \frac{P[A \mid B_i]P[B_i]}{\sum_{j=1}^{n} P[A \mid B_j]P[B_j]}. \tag{2.6}$$

This formula is called Bayes' formula.

Remark. We also have (Bayes' rule):

$$P[B \mid A] = \frac{P[A \mid B]P[B]}{P[A]} \quad \text{if } P[A]P[B] > 0. \tag{2.7}$$
Fig. 2.9. Example of the law of total probability with $n = 3$: $P[A] = P[A \cap B_1] + P[A \cap B_2] + P[A \cap B_3]$.


Example 2.5.1. Suppose that machines $M_1$, $M_2$, and $M_3$ produce, respectively, 500,
1000, and 1500 parts per day, of which 5%, 6%, and 7% are defective. A part produced
by one of these machines is taken at random, at the end of a given workday, and it is
found to be defective. What is the probability that it was produced by machine $M_3$?


Solution. Let $A_i$ be the event “the part taken at random was produced by machine
$M_i$,” for $i = 1, 2, 3$, and let $D$ be “the part taken at random is defective.” We seek

$$P[A_3 \mid D] = \frac{P[D \mid A_3]P[A_3]}{\sum_{i=1}^{3} P[D \mid A_i]P[A_i]} = \frac{(0.07)\left(\frac{1}{2}\right)}{(0.05)\left(\frac{1}{6}\right) + (0.06)\left(\frac{1}{3}\right) + (0.07)\left(\frac{1}{2}\right)} = \frac{105}{190} \simeq 0.5526.$$
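Bayes' formula lends itself to a short computation. The sketch below applies it to the data of Example 2.5.1 with exact fractions; the function name `posterior` and the variable names are assumptions made for illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' formula: P[B_i | A] from the P[B_i] (priors) and the P[A | B_i] (likelihoods)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P[A n B_i]
    total = sum(joint)                                    # law of total probability: P[A]
    return [j / total for j in joint]

# Data of Example 2.5.1: daily production and defective rates of the three machines.
production = [500, 1000, 1500]
priors = [Fraction(n, sum(production)) for n in production]
defect_rates = [Fraction(5, 100), Fraction(6, 100), Fraction(7, 100)]

post = posterior(priors, defect_rates)
print(post[2], float(post[2]))  # 21/38, i.e. 105/190 in lowest terms
```

The three posterior probabilities add up to 1, as they must, since the events $A_1$, $A_2$, and $A_3$ form a partition.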


2.6 Combinatorial analysis 


Suppose that we perform a random experiment that can be divided into two steps. On
the first step, outcome $A_1$ or outcome $A_2$ may occur. On the second step, either of
outcomes $B_1$, $B_2$, or $B_3$ may occur. We can use a tree diagram to describe the sample
space of this random experiment, as in Figure 2.10.


Example 2.6.1. Tests conducted with a new breath alcohol analyzer enabled us to 
establish that (i) 5 times out of 100 the test proved positive even though the person 
subjected to the test was not intoxicated; (ii) 90 times out of 100 the test proved 
positive and the person tested was really intoxicated. Moreover, we estimate that 1% of 
the persons subjected to the test are really intoxicated. Calculate the probability that 

(a) the test will be positive for the next person subjected to it; 

(b) a given person is intoxicated, given that the test is positive. 
Fig. 2.10. Example of a tree diagram.


Solution. Let $A$ be the event “the test is positive” and let $E$ be “the person subjected to
the test is intoxicated.” From the above assumptions, we can construct the tree diagram
in Figure 2.11, where the marginal probabilities of events $E$ and $E'$ are written above
the branches, as well as the conditional probabilities of events $A$ and $A'$, given that $E$
or $E'$ occurred. Furthermore, we know by the multiplication rule that the product of
these probabilities is equal to the probability of the intersections $E \cap A$, $E \cap A'$, and so
on.


(a) We have:

$$P[A] = P[E \cap A] + P[E' \cap A] = (0.01)(0.90) + (0.99)(0.05) = 0.0585.$$


(b) We calculate

$$P[E \mid A] = \frac{P[E \cap A]}{P[A]} \overset{\text{(a)}}{=} \frac{0.009}{0.0585} \simeq 0.1538.$$


Note that this probability is very low. If we assume that 60% of the individuals
subjected to the test are intoxicated (rather than 1%), then we find that $P[A]$ becomes
0.56 and $P[E \mid A] \simeq 0.9643$, which is much more reasonable. Therefore, this breath
alcohol analyzer is only efficient if we use it for individuals who are suspected of being
intoxicated.
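The effect of the proportion of intoxicated persons on $P[E \mid A]$ can be explored with a short function. This is a sketch under the reading of Example 2.6.1 used in the solution, namely $P[A \mid E] = 0.90$ and $P[A \mid E'] = 0.05$; the function name `breath_test` and its parameter names are chosen here for illustration.

```python
def breath_test(prevalence, p_pos_given_intox=0.90, p_pos_given_sober=0.05):
    """Return (P[A], P[E | A]) for a given proportion P[E] of intoxicated persons."""
    p_E_and_A = prevalence * p_pos_given_intox                    # multiplication rule
    p_A = p_E_and_A + (1 - prevalence) * p_pos_given_sober        # law of total probability
    return p_A, p_E_and_A / p_A                                   # Bayes' rule

print(breath_test(0.01))  # case treated in the example
print(breath_test(0.60))  # case where 60% of the persons tested are intoxicated
```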


Remark. In general, if a random experiment comprises $k$ steps and if there are $n_j$ possible
outcomes on the $j$th step, for $j = 1, \ldots, k$, then there are $n_1 \times \cdots \times n_k$ elementary
outcomes in the sample space. This is known as the multiplication principle.

Suppose now that we have $n$ distinct objects and that we take, at random and
without replacement, $r$ objects among them, where $r \in \{(0,)\,1, \ldots, n\}$. The number of
possible arrangements is given by

$$n \times (n-1) \times \cdots \times [n - (r-1)] = \frac{n!}{(n-r)!} := P_r^n. \tag{2.8}$$

The symbol $P_r^n$ designates the number of permutations of $n$ distinct objects taken $r$ at
a time. The order of the objects is important.

Remarks. (i) Reminder. We have that $n! = 1 \times 2 \times \cdots \times n$, for $n = 1, 2, 3, \ldots$, and $0! = 1$,
by definition.

(ii) Taking r objects without replacement means that the objects are taken one at a 
time and that a given object cannot be chosen more than once. This is equivalent to 
taking the r objects all at once. In the case of sampling with replacement, any object 
can be chosen up to r times. 


Example 2.6.2. If we have four different letters (for instance, a, b, c, and d), then we
can form

$$P_3^4 = \frac{4!}{(4-3)!} = 24$$

different three-letter “words” if each letter is used at most once. We can use a tree
diagram to draw the list of words.
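Instead of drawing the tree diagram by hand, the list of words can be generated by a program. The following Python sketch uses the standard library function `itertools.permutations`; the variable names are chosen here for illustration.

```python
from itertools import permutations
from math import factorial

# All ordered selections of 3 letters among a, b, c, d, each letter used at most once.
words = ["".join(w) for w in permutations("abcd", 3)]
count = len(words)

print(words[:6])  # first few words: abc, abd, acb, acd, adb, adc
print(count, factorial(4) // factorial(4 - 3))  # both equal 4!/(4-3)!
```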


Finally, if the order of the objects is not important, then the number of ways to take,
at random and without replacement, $r$ objects among $n$ distinct objects is given by

$$\frac{n \times (n-1) \times \cdots \times [n - (r-1)]}{r!} = \frac{n!}{r!\,(n-r)!} := C_r^n \tag{2.9}$$

for $r \in \{(0,)\,1, \ldots, n\}$. The symbol $C_r^n$, or $\binom{n}{r}$, designates the number of combinations
of $n$ distinct objects taken $r$ at a time.

Remarks. (i) Each combination of $r$ objects enables us to form $r!$ different permutations,
because $P_r^r = r!/0! = r!$.

(ii) Moreover, it is easy to check that $C_r^n = C_{n-r}^n$.
Example 2.6.3. Three parts are taken, at random and without replacement, among 10
parts, of which 2 are defective. What is the probability that at least 1 defective part is
obtained?

Solution. Let $F$ be the event “at least one part is defective among the three parts
taken at random.” We can write that

$$P[F] = 1 - P[F'] = 1 - \frac{C_0^2 \, C_3^8}{C_3^{10}} = 1 - \frac{\frac{8!}{3!\,5!}}{\frac{10!}{3!\,7!}} = 1 - \frac{56}{120} = \frac{8}{15} \simeq 0.5333.$$

2.7 Exercises for Chapter 2 


Solved exercises 


Question no. 1 

We consider the following random experiment: a fair die is rolled; if (and only if)
a “6” is obtained, the die is rolled a second time. How many elementary outcomes are
there in the sample space $\Omega$?

Question no. 2

Let $\Omega = \{e_1, e_2, e_3\}$, where $P[\{e_i\}] > 0$, for $i = 1, 2, 3$. How many different partitions
of $\Omega$, excluding the partition $\{\emptyset, \Omega\}$, can be formed?

Question no. 3 

A fair die is rolled twice, independently. Knowing that an even number was obtained 
on the first roll, what is the probability that the sum of the two numbers obtained is 
equal to 4? 

Question no. 4 

Suppose that $P[A] = P[B] = 1/4$ and that $P[A \mid B] = P[B]$. Calculate $P[A \cap B']$.

Question no. 5 

A system is made up of three independent components. It operates if at least two 
of the three components operate. If the reliability of each component is equal to 0.95, 
what is the reliability of the system? 

Question no. 6 

Suppose that $P[A \cap B] = 1/4$, $P[A \mid B'] = 1/8$, and $P[B] = 1/2$. Calculate $P[A]$.

Question no. 7 

Knowing that we obtained at least once the outcome “heads” in three independent 
tosses of a fair coin, what is the probability that we obtained “heads” three times? 
Question no. 8

Suppose that $P[B \mid A_1] = 1/2$ and that $P[B \mid A_2] = 1/4$, where $A_1$ and $A_2$ are two
equiprobable events forming a partition of $\Omega$. Calculate $P[A_1 \mid B]$.

Question no. 9 

Three horses, a, b, and c, enter a race. If the outcome bac means that b finished
first, a second, and c third, then the set of all possible outcomes is

$$\Omega = \{abc, acb, bac, bca, cab, cba\}.$$

We suppose that $P[\{abc\}] = P[\{acb\}] = 1/18$ and that each of the other four elementary
outcomes has a 2/9 probability of occurring. Moreover, we define the events

A = “a finishes before b” and B = “a finishes before c.”

(a) Do the events A and B form a partition of $\Omega$?

(b) Are A and B independent events? 

Question no. 10 

Let $\mathcal{E}$ be a random experiment for which there are three elementary outcomes: A, B,
and C. Suppose that we repeat $\mathcal{E}$ indefinitely and independently. Calculate, in terms of
$P[A]$ and $P[B]$, the probability that A occurs before B.

Hints. (i) You can make use of the law of total probability.

(ii) Let $D$ be the event “A occurs before B.” Then, we may write that

$$P[D \mid C \text{ occurs on the first repetition}] = P[D].$$


Question no. 11 

Transistors are drawn at random and with replacement from a box containing a 
very large number of transistors, some of which are defectless and others are defective, 
and are tested one at a time. We continue until either a defective transistor has been 
obtained or three transistors in all have been tested. Describe the sample space $\Omega$ for
this random experiment. 

Question no. 12 

Let $A$ and $B$ be events such that $P[A] = 1/3$ and $P[B' \mid A] = 5/7$. Calculate $P[B]$
if B is a subset of A. 

Question no. 13 

In a certain university, the proportion of full, associate, and assistant professors, 
and of lecturers is 30%, 40%, 20%, and 10% respectively, of which 60%, 70%, 90%, and 
40% hold a PhD. What is the probability that a person taken at random among those 
teaching at this university holds a PhD? 

Question no. 14 

All the items in stock in a certain store bear a code made up of five letters. If the 
same letter is never used more than once in a given code, how many different codes can 
there be? 
Question no. 15

A fair die is rolled twice, independently. Consider the events 
A = “the first number that shows up is a 6;” 

B = “the sum of the two numbers obtained is equal to 7;” 

C = “the sum of the two numbers obtained is equal to 7 or 11.” 

(a) Calculate $P[B \mid C]$.

(b) Calculate $P[A \mid B]$.

(c) Are A and B independent events? 

Question no. 16 

A commuter has two vehicles, one being a compact car and the other one a minivan. 
Three times out of four, he uses the compact car to go to work and the remainder of 
the time he uses the minivan. When he uses the compact car (resp., the minivan), he 
gets home before 5:30 p.m. 75% (resp., 60%) of the time. However, the minivan has air 
conditioning. Calculate the probability that 

(a) he gets home before 5:30 p.m. on a given day; 

(b) he used the compact car if he did not get home before 5:30 p.m.; 

(c) he uses the minivan and he gets home after 5:30 p.m.; 

(d) he gets home before 5:30 p.m. on two (independent) consecutive days and he does 
not use the same vehicle on these two days. 

Question no. 17 

Rain is forecast half the time in a certain region during a given time period. We 
estimate that the weather forecasts are accurate two times out of three. Mr. X goes out 
every day and he really fears being caught in the rain without an umbrella. Consequently, 
he always carries his umbrella if rain is forecast. Moreover, he even carries his umbrella 
one time out of three if rain is not forecast. Calculate the probability that it is raining 
and Mr. X does not have his umbrella. 

Question no. 18 

A fair die is rolled three times, independently. Let F be the event “the first number 
obtained is smaller than the second one, which is itself smaller than the third one.” 
Calculate P[F]. 

Question no. 19 

We consider the set of all families having exactly two children. We suppose that each 
child has a 50-50 chance of being a boy. Let the events be 
$A_1$ = “both sexes are represented among the children;”

$A_2$ = “at most one child is a girl.”

(a) Are $A_1$ and $A_2$ incompatible events?

(b) Are $A_1$ and $A_2'$ independent events?

(c) We also suppose that the probability that the third child of an arbitrary family is a
boy is equal to 11/20 if the first two children are boys, to 2/5 if the first two children
are girls, and to 1/2 in the other cases. Knowing that the third child of a given family
is a boy, what is the probability that the first two are also boys? 

Exercises 


Question no. 1 

We study the traffic (in one direction) on two roads, 1 and 2, which merge to form 
road 3 (see Figure 2.12). Roads 1 and 2 have the same capacity (number of lanes) and
road 3 has a greater capacity than road 1 and road 2. During rush hours, the probability 
that the traffic is congested on road 1 (resp., road 2) is equal to 0.1 (resp., 0.3). Moreover, 
given that traffic is congested on road 2, it is also congested on road 1 one time out of 
three. We define the events 

A,B,C = “traffic is congested on roads 1, 2, 3, respectively.” 



(a) Calculate the probability that traffic is congested 

(i) on roads 1 and 2; 

(ii) on road 2, given that it is congested on road 1; 

(iii) on road 1 only; 

(iv) on road 2 only; 

(v) on road 1 or on road 2; 

(vi) neither on road 1 nor on road 2. 

(b) On road 3, traffic is congested with probability 

1 if it is congested on roads 1 and 2; 

0.15 if it is congested on road 2 only; 

0.1 if it is neither congested on road 1 nor on road 2. 

Calculate the probability that traffic is congested 

(i) on road 3; 

(ii) on road 1, given that it is congested on road 3. 

Question no. 2 

We roll a die and then we toss a coin. If we obtain “tails,” then we roll the die a 
second time. Suppose that the die and the coin are fair. What is the probability of 

(a) obtaining “heads” or a 6 on the first roll of the die; 

(b) obtaining no 6s; 

(c) obtaining exactly one 6; 

(d) having obtained “heads,” given that we obtained exactly one 6. 

Question no. 3 (see Example 2.4.1) 

A device is composed of two components, A and B, subject to random failures. The 
components are connected in parallel and, consequently, the device is down only if both 
components are down. The two components are not independent. We estimate that the 
probability of 

a failure of component A is equal to 0.2; 

a failure of component B is equal to 0.8 if component A is down; 
a failure of component B is equal to 0.4 if component A is active. 

(a) Calculate the probability of a failure 

(i) of component A if component B is down; 

(ii) of exactly one component. 

(b) In order to increase the reliability of the device, a third component, C, is added in 
such a way that components A, B, and C are connected in parallel. The probability 
that component C breaks down is equal to 0.2, independently of the state (up or down) 
of components A and B. Given that the device is active, what is the probability that 
component C is down? 

Question no. 4 

In a factory producing electronic parts, the quality control is ensured through three 
tests as follows: 

• each component is subjected to test no. 1; 

• if a component passes test no. 1, then it is subjected to test no. 2; 

• if a component passes test no. 2, then it is subjected to test no. 3; 

• as soon as a component fails a test, it is returned for repair. 

We define the events 

$A_i$ = “the component fails test no. $i$,” for $i = 1, 2, 3$. 

From past experience, we estimate that 

$P[A_1] = 0.1$, $P[A_2 \mid A_1'] = 0.05$, and $P[A_3 \mid A_1' \cap A_2'] = 0.02$. 

The elementary outcomes of the sample space $\Omega$ are: $\omega_1 = A_1$, $\omega_2 = A_1' \cap A_2$, 
$\omega_3 = A_1' \cap A_2' \cap A_3$, and $\omega_4 = A_1' \cap A_2' \cap A_3'$. 

(a) Calculate the probability of each elementary outcome. 

(b) Let R be the event “the component must be repaired.” 

(i) Express $R$ in terms of $A_1$, $A_2$, $A_3$. 

(ii) Calculate the probability of $R$. 

(iii) Calculate $P[A_1' \cap A_2 \mid R]$. 

(c) We test three components and we define the events 

$R_k$ = “the $k$th component must be repaired,” for $k = 1, 2, 3$, and 

$B$ = “at least one of the three components passes all three tests.” 

We assume that the events $R_k$ are independent. 

(i) Express $B$ in terms of $R_1$, $R_2$, and $R_3$. 

(ii) Calculate P[B]. 

Question no. 5 

Let $A$, $B$, and $C$ be events such that $P[A] = 1/2$, $P[B] = 1/3$, $P[C] = 1/4$, and 
$P[A \cap C] = 1/12$. Furthermore, $A$ and $B$ are incompatible. Calculate $P[A \mid B \cup C]$. 

Question no. 6 

In a group of 20,000 men and 10,000 women, 6% of men and 3% of women suffer 
from a certain disease. What is the probability that a member of this group suffering 
from the disease in question is a man? 

Question no. 7 

Two tokens are taken at random and without replacement from an urn containing 
10 tokens, numbered from 1 to 10. What is the probability that the larger of the two 
numbers obtained is 3? 

Question no. 8 

We consider the system in Figure 2.13. The components fail independently of each 
other. During a certain time period, the type A components fail with probability 0.3 and 
component B (resp., C ) fails with probability 0.01 (resp., 0.1). Calculate the probability 
that the system is not down at the end of this period. 

Question no. 9 

A sample of size 20 is drawn (without replacement) from a lot of infinite size 
containing 2% defective items. Calculate the probability of obtaining at least one defective 
item in the sample. 

Question no. 10 

A lot contains 10 items, of which one is defective. The items are examined one 
by one, without replacement, until the defective item has been found. What is the 
probability that this defective item will be (a) the second item examined? (b) the ninth 
item examined? 







Fig. 2.13. Figure for Exercise no. 8. 


Question no. 11 

A bag holds two coins: a fair one and one with which we always get “heads.” A coin 
is drawn at random and is tossed. Knowing that “heads” was obtained, calculate 

(a) the probability that the fair coin was drawn; 

(b) the probability of obtaining “heads” on a second toss of the same coin. 

Question no. 12 

The diagnosis of a physician in regard to one of her patients is unsure. She hesitates 
between three possible diseases. From past experience, we were able to construct the 
following tables: 



where the $D_i$'s represent the diseases and the $S_j$'s are the symptoms. In addition, we 
assume that the four symptoms are incompatible, exhaustive, and equiprobable. 

(a) Independently of the symptom present in the patient, what is the probability that 
he or she suffers from the first disease? 

(b) What is the probability that the patient suffers from the second disease and presents 
symptom $S_1$? 

(c) Given that the patient suffers from the third disease, what is the probability that 
he or she presents symptom $S_2$? 

(d) We consider two independent patients. What is the probability that they do not 
suffer from the same disease, if we assume that the three diseases are incompatible? 

Question no. 13 

We consider a system comprising four components operating independently of each 
other and connected as in Figure 2.14. 

Fig. 2.14. Figure for Exercise no. 13. 


The reliability of each component is supposed constant, over a certain time period, 
and is given by the following table: 


Component      1      2      3      4 
Reliability    0.9    0.95   0.95   0.99 


(a) What is the probability that the system operates at the end of this time period? 

(b) What is the probability that component no. 3 is down and the system still operates? 

(c) What is the probability that at least one of the four components is down? 

(d) Given that the system is down, what is the probability that it will resume operating 
if we replace component no. 1 by an identical (nondefective) component? 

Question no. 14 

A box contains 8 brand A and 12 brand B transistors. Two transistors are drawn 
at random and without replacement. What is the probability that they are both of the 
same brand? 


Question no. 15 

What is the reliability of the system shown in Figure 2.15 if the four components 
operate independently of each other and all have a reliability equal to 0.9 at a given 
time instant? 

Fig. 2.15. Figure for Exercise no. 15. 


Question no. 16 

Let $A_1$ and $A_2$ be two events such that $P[A_1] = 1/4$, $P[A_1 \cap A_2] = 3/16$, and 
P [A 2 I = 1/8. Calculate P [A' 2 \. 

Question no. 17 

A fair coin is tossed until either “heads” is obtained or the total number of tosses 
is equal to 3. Given that the random experiment ended with “heads,” what is the 
probability that the coin was tossed only once? 

Question no. 18 

In a room, there are four 18-year-old male students, six 18-year-old female students, 
six 19-year-old male students, and x 19-year-old female students. What must be the 
value of x, if we want age and sex to be independent when a student is taken at random 
in the room? 

Question no. 19 

Stores $S_1$, $S_2$, and $S_3$ of the same company have, respectively, 50, 70, and 100 
employees, of which 50%, 60%, and 75% are women. A person working for this company 
is taken at random. If the employee selected is a woman, what is the probability that 
she works in store $S_3$? 

Question no. 20 

Harmful nitrogen oxides constitute 20%, in terms of weight, of all pollutants present 
in the air in a certain metropolitan area. Emissions from car exhausts are responsible 
for 70% of these nitrogen oxides, but for only 10% of all the other air pollutants. What 
percentage of the total pollution for which emissions from car exhausts are responsible 
are harmful nitrogen oxides? 

Question no. 21 

Three machines, $M_1$, $M_2$, and $M_3$, produce, respectively, 1%, 3%, and 5% defective 
items. Moreover, machine $M_1$ produces twice as many items on an arbitrary day as 
machine $M_2$, which itself produces three times as many items as machine $M_3$. An item 
is taken at random among those manufactured on a given day, then a second item is 
taken at random among those manufactured by the machine that produced the first 
selected item. Knowing that the first selected item is defective, what is the probability 
that the second one is also defective? 


Question no. 22 


A machine is made up of five components connected as in the diagram of Figure 2.16. 


Each component operates with probability 0.9, independently of the other components. 



Fig. 2.16. Figure for Exercise no. 22. 


(a) Knowing that component no. 1 is down, what is the probability that the machine 
operates? 

(b) Knowing that component no. 1 is down and that the machine still operates, what 
is the probability that component no. 3 is active? 

Question no. 23 

Before being declared to conform to the technical norms, devices must pass two 
quality control tests. According to the data gathered so far, 75% of the devices tested 
in the course of a given week passed the first test. The devices are subjected to the 
second test, whether they pass the first test or not. We found that 80% of the devices 
that passed the second test had also passed the first one. Furthermore, 20% of those 
that failed the second test had passed the first one. 

(a) What is the probability that a given device passed the second test? 

(b) Find the probability that, for a given device, the second test contradicts the first 
one. 

(c) Calculate the probability that a given device failed the second test, knowing that it 
passed the first one. 

Question no. 24 

In a certain workshop, the probability that a part manufactured by an arbitrary 
machine is nondefective, that is, conforms to the technical norms, is equal to 0.9. The 
quality control engineer proposes to adopt a procedure that classifies as nondefective 
with probability 0.95 the parts indeed conforming to the norms, and with only 
probability 0.15 those not conforming to these norms. It is decided that every part will be 
subjected to this quality control procedure twice, independently. 

(a) What is the probability that a part having passed the procedure twice does indeed 
conform to the norms? 

(b) Suppose that if a part fails the first control test, then it is withdrawn immediately. 
Let $B_j$ be the event “a given part passed (if the case may be) the $j$th control test,” for 
$j = 1, 2$. Calculate (i) $P[B_2]$ and (ii) $P[B_1' \cap B_2']$. 

Question no. 25 

We have 20 type I components, of which 5 are defective, and 30 type II components, 
of which 15 are defective. 


(a) We wish to construct a system comprising 10 type I components and 5 type II 
components connected in series. What is the probability that the system will operate if 
the components are taken at random? 

(b) How many different systems comprising four components connected in series, of 
which at least two are of type I, can be constructed, if the order of the components is 
taken into account? 

Remarks. (i) We suppose that we can differentiate two components of the same type. 

(ii) When a system is made up of components connected in series, then it operates if 
and only if every component operates. 

Question no. 26 

A system is made up of n components, including components A and B. 

(a) Show that if the components are connected in series, then the probability that there 
are exactly r components between A and B is given by 


$$\frac{2(n - r - 1)}{(n - 1)n}$$

for $r \in \{0, 1, \ldots, n - 2\}$. 


(b) Calculate the probability that there are exactly r components between A and B if 
the components are connected in a circle. 

(c) Suppose that $n = 5$ and that the components are connected in series. Calculate 
the probability of operation of the subsystem constituted of components A, B, and the 
r components placed between them if the components operate independently of each 
other and all have a reliability of 0.95. 
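
The formula in part (a) can be checked by brute force for small values of n. The following Python sketch is not part of the exercise: it assumes that all n! orderings of the components are equally likely, and the names `gap_distribution` and `formula` are purely illustrative.

```python
from fractions import Fraction
from itertools import permutations


def gap_distribution(n):
    """Exact distribution of the number of components between A and B
    when n distinguishable components are lined up in a random order
    (assumption: every ordering is equally likely)."""
    labels = list(range(n))      # label 0 plays the role of A, label 1 of B
    counts = [0] * (n - 1)       # counts[r] = orderings with r components in between
    total = 0
    for arrangement in permutations(labels):
        r = abs(arrangement.index(0) - arrangement.index(1)) - 1
        counts[r] += 1
        total += 1
    return [Fraction(c, total) for c in counts]


def formula(n, r):
    """Closed form stated in part (a): 2(n - r - 1) / ((n - 1) n)."""
    return Fraction(2 * (n - r - 1), (n - 1) * n)


for n in range(2, 8):
    probs = gap_distribution(n)
    # The enumeration and the closed form agree for every admissible r.
    assert probs == [formula(n, r) for r in range(n - 1)]
    print(n, [str(p) for p in probs])
```

An enumeration of this kind only confirms the formula for the values of n that were tried; the exercise still asks for a general argument.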

Question no. 27 

A man owns a car and a motorcycle. Half the time, he uses his motorcycle to go to 
work. One-third of the time, he drives his car to work and, the remainder of the time, 
he uses public transportation. He gets home before 5:30 p.m. 75% of the time when he 
uses his motorcycle. This percentage is equal to 60% when he drives his car and to 2% 
when he uses public transportation. Calculate the probability that 

(a) he used public transportation if he got home after 5:30 p.m. on a given day; 

(b) he got home before 5:30 p.m. on two consecutive (independent) days and he used 
public transportation on exactly one of these two days. 

Question no. 28 

In a certain airport, a shuttle coming from the city center stops at each of the four 
terminals to let passengers get off. Suppose that the probability that a given passenger 
gets off at a particular terminal is equal to 1/4. If there are 20 passengers using the 
shuttle and if they occupy seats numbered from 1 to 20, what is the probability that 
the passengers sitting in seats nos. 1 to 4 all get off 

(a) at the same stop? 

(b) at different stops? 

Question no. 29 

A square grid consists of 289 points. A particle is at the center of the grid. Every 
second, it moves at random to one of the four nearest points from the one it occupies. 
When the particle arrives at the boundary of the grid, it is absorbed. 

(a) What is the probability that the particle is absorbed after eight seconds? 

(b) Let $A_i$ be the event “the particle is at the center of the grid after $i$ seconds.” 
Calculate $P[A_i]$ (knowing that $A_0$ is certain, by assumption). 

Question no. 30 

Five married couples bought 10 tickets for a concert. In how many ways can they 
sit (in the same row) if 

(a) the five men want to sit together? 

(b) the two spouses in each couple want to sit together? 

Multiple choice questions 


Question no. 1 

Two weeks prior to the most recent general election, a poll conducted among 1000 
voters revealed that 48% of them intended to vote for the party in power. A survey made 
after the election, among the same sample of voters, showed that 90% of the persons 
who indeed voted for the party in power intended to vote for this party, and 10% of 
those who voted for another party intended (two weeks prior to the election) to vote for 
the party in power. Let the events be 

A = “a voter, taken at random in the sample, intended to vote for the party in 
power;” 

B = “a voter, taken at random in the sample, voted for the party in power.” 

(A) From the statement of the problem, we can write that P [A] = 0.48 and that 

(a) $P[A \cap B] = 0.9$; $P[A \cap B'] = 0.1$ 

(b) $P[B \mid A] = 0.9$; $P[B' \mid A] = 0.1$ 

(c) $P[A \mid B] = 0.9$; $P[A \mid B'] = 0.1$ 

(d) $P[A' \cap B] = 0.9$; $P[A \cap B'] = 0.1$ 

(e) $P[A \mid B] = 0.9$; $P[B' \mid A] = 0.1$ 

(B) The probability of event $B$ is given by 

(a) 0.45 (b) 0.475 (c) 0.48 (d) 0.485 (e) 0.50 (f) 0.515 

(C) Are events A and B' incompatible? 

(a) yes (b) no (c) we cannot conclude from the information provided 

(D) Are events A and B' independent? 

(a) yes (b) no (c) we cannot conclude from the information provided 

(E) Do events $A$ and $B'$ form a partition of the sample space $\Omega$? 

(a) yes (b) no (c) we cannot conclude from the information provided 

(F) Let E be “a voter, taken at random among the 1000 voters polled, did not vote as 
he intended to two weeks prior to the election (in regard to the party in power).” We 
can write that 

(a) $P[E] = P[A \mid B'] + P[A' \mid B]$ 

(b) $P[E] = P[A \cap B] + P[A' \cap B']$ 

(c) $P[E] = P[B' \mid A] + P[B \mid A']$ 

(d) $P[E] = P[A \cap B'] + P[A' \cap B']$ 

(e) $P[E] = P[A \cap B'] + P[A' \cap B]$ 

Question no. 2 

Let A and B be two events such that 

$$P[A \cap B] = P[A' \cap B] = P[A \cap B'] = p.$$

Calculate $P[A \cup B]$. 

(a) $p$ (b) $2p$ (c) $3p$ (d) $3p^2$ (e) $p^3$ 

Question no. 3 

We have nine electronic components, of which one is defective. Five components are 
taken at random to construct a system in series. What is the probability that the system 
does not operate? 

(a) 1/3 (b) 4/9 (c) 1/2 (d) 5/9 (e) 2/3 

Question no. 4 

Two dice are rolled simultaneously. If a sum of 7 or 11 is obtained, then a coin is 
tossed. How many elementary outcomes [of the form (die1, die2) or (die1, die2, coin)] 
are there in the sample space $\Omega$? 

(a) 28 (b) 30 (c) 36 (d) 44 (e) 72 

Question no. 5 

Let $A$ and $B$ be two independent events such that $P[A] < P[B]$, $P[A \cap B] = 6/25$, 
and $P[A \mid B] + P[B \mid A] = 1$. Calculate $P[A]$. 

(a) 1/25 (b) 1/5 (c) 6/25 (d) 2/5 (e) 3/5 

Question no. 6 

In a certain lottery, 4 balls are drawn at random and without replacement among 
25 balls numbered from 1 to 25. The player wins the grand prize if the 4 balls that 
she selected are drawn in the order indicated on her ticket. What is the probability of 
winning the grand prize? 

(a) 1/12,650 (b) 24/390,625 (c) 1/303,600 (d) 1/390,625 (e) 1/6,375,600 

Question no. 7 

New license plates are made up of three letters followed by three digits. If we suppose 
that the letters I and O are not used and that no plates bear the digits 000, how many 
different plates can there be? 

(a) $24^3 \times 9^3$ (b) $(26 \times 25 \times 24)(10 \times 9 \times 8)$ (c) $24^3 \times (10 \times 9 \times 8)$ 

(d) $24^3 \times 999$ (e) $25^3 \times 9^3$ 


Question no. 8 

Let $P[A \mid B] = 1/2$, $P[B'] = 1/3$, and $P[A \cap B'] = 1/4$. Calculate $P[A]$. 
(a) 1/4 (b) 1/3 (c) 5/12 (d) 1/2 (e) 7/12 


Question no. 9 

In the lottery known as 6/49, first 6 balls are drawn at random and without 
replacement among 49 balls numbered from 1 to 49. Next, a seventh ball (the bonus number) 
is drawn at random among the 43 remaining balls. A woman selected what she thinks 
would be the six winning numbers for the next draw. What is the probability that this 
woman actually did not select any of the seven balls that will be drawn (including the 
bonus number)? 


(a) 







Question no. 10 

In a quality control procedure, every electronic component manufactured is subjected 
to (at most) three tests. After the first test, an arbitrary component is classified as either 
“good,” “average,” or “defective,” and likewise after the second test. Finally, after the 
last test, the components are classified as either “good” or “defective.” As soon as a 
component is classified as defective after a test, it is returned to the factory for repair. 
The following random experiment is performed: a component is taken at random and 
the result of each test it is subjected to is recorded. How many elementary outcomes 
are there in the sample space $\Omega$? 


(a) 3 (b) 8 (c) 11 (d) 18 (e) 21 


Question no. 11 

Let $P[A] = 1/3$, $P[B] = 1/2$, $P[C] = 1/4$, $P[A \mid B] = 1/2$, $P[B \mid A] = 3/4$, 
$P[A \mid C] = 1/3$, $P[C \mid A] = 1/4$, and $P[B \cap C] = 0$. Calculate the probability 
$P[A \mid B \cup C]$. 

(a) 0 (b) 1/3 (c) 4/9 (d) 5/6 (e) 1 

Question no. 12 

A fair die is rolled twice (independently). Consider the events 
A = “the two numbers obtained are different;” 

B = “the first number obtained is a 6;” 

C = “the two numbers obtained are even.” 

Which pairs of events are the only ones comprised of independent events? 

(a) no pairs (b) (A, B) (c) (A, B) and (B, C) (d) (A, B) and (A, C) 

(e) the three pairs 

Question no. 13 

A man plays a series of games for which the probability of winning a given game, 
from the second one, is equal to 3/4 if he won the previous game and to 1/4 otherwise. 
Calculate the probability that he wins games nos. 2 and 3 consecutively if the probability 
that he wins the first game is equal to 1/2. 

(a) 3/16 (b) 1/4 (c) 3/8 (d) 9/16 (e) 5/8 

Question no. 14 

A box contains two coins, one of them being fair but the other one having two 
“heads.” A coin is taken at random and is tossed twice, independently. Calculate the 
probability that the fair coin was selected if “heads” was obtained twice. 

(a) 1/5 (b) 1/4 (c) 1/3 (d) 1/2 (e) 3/5 


Random variables 


The elements of a sample space may take diverse forms: real numbers, but also brands of 
components, colors, “good,” or “defective,” and so on. Because it is easier to work with 
real numbers, in this chapter we transform all the elementary outcomes into numerical 
values, by means of random variables. We consider the most important particular cases 
and define the main functions that characterize random variables. 


3.1 Introduction 

Definition 3.1.1. A random variable is a real-valued function defined on a sample 
space. 

Example 3.1.1. (i) Suppose that a coin is tossed. The function X that associates the 
number 1 with the outcome “heads” and the number 0 with the outcome “tails” is a 
random variable. 

(ii) Suppose now that a die is rolled. The function X that associates with each elemen¬ 
tary outcome the number obtained (so that X is the identity function in this case) is 
also a random variable. 

Example 3.1.2. Consider the random experiment that consists in observing the time 
T that a person must wait in line to use an automatic teller machine. The function T 
is a random variable. 

3.1.1 Discrete case 

Definition 3.1.2. A random variable is said to be of discrete type if the number of 
different values it can take is finite or countably infinite. 


M. Lefebvre, Basic Probability Theory with Applications, Springer Undergraduate Texts in Mathematics 
and Technology, DOI: 10.1007/978-0-387-74995-2_3, 

© Springer Science + Business Media, LLC 2009 

Definition 3.1.3. The function $p_X$ that associates with each possible value of the 
(discrete) random variable $X$ the probability of this value is called the probability 
(mass) function of $X$. 

Let $\{x_1, x_2, \ldots\}$ be the set of possible values of the discrete random variable $X$. The 
function $p_X$ has the following properties: 

(i) $p_X(x_k) \ge 0$ for all $x_k$; 

(ii) $\sum_{k=1}^{\infty} p_X(x_k) = 1$. 


Example 3.1.1 (continued). (i) If the coin is fair (or unbiased, or well-balanced), we 
may write that 

x           0     1 
$p_X(x)$    1/2   1/2 

(ii) Similarly, if the die is fair, then we have the following table: 

x           1     2     3     4     5     6 
$p_X(x)$    1/6   1/6   1/6   1/6   1/6   1/6 


Definition 3.1.4. The function $F_X$ that associates with each real number $x$ the 
probability $P[X \le x]$ that the random variable $X$ takes on a value smaller than or equal to 
this number is called the distribution function of $X$. We have: 

$$F_X(x) = \sum_{x_k \le x} p_X(x_k).$$

Remark. The function $F_X$ is nondecreasing and right-continuous. 


Example 3.1.1 (continued). (i) In the case of the coin, we easily find that (if 
$P[\{\text{heads}\}] = 1/2$) 

x           0     1 
$F_X(x)$    1/2   1 

Remark. More completely, we may write that 

$$F_X(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1/2 & \text{if } 0 \le x < 1, \\ 1 & \text{if } x \ge 1, \end{cases}$$

where $x$ is an arbitrary real number. 

(ii) If the die is well-balanced, then we deduce from the function $p_X(x)$ the following 
table: 

x           1     2     3     4     5     6 
$F_X(x)$    1/6   1/3   1/2   2/3   5/6   1 

(see Figure 3.1). 

Fig. 3.1. Distribution function of the random variable in Example 3.1.1 (ii). 
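
As a small illustration of Definition 3.1.4, the distribution function of a discrete random variable can be evaluated at any real number by summing the probability function over the possible values that do not exceed it. The sketch below uses the fair die of Example 3.1.1 (ii); the function name `distribution_function` and the evaluation points are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def distribution_function(pmf, x):
    """F_X(x) = sum of p_X(x_k) over all possible values x_k <= x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


# Probability function of a fair die (Example 3.1.1 (ii)).
die_pmf = {k: Fraction(1, 6) for k in range(1, 7)}

# F_X is defined for every real x, not only for the possible values of X:
# it is constant between two consecutive jump points.
for x in [0.5, 1, 2, 3.7, 6, 10]:
    print(x, distribution_function(die_pmf, x))
```

Evaluating the function between two integers, for instance at 3.7, returns the same value as at 3, which is the staircase shape shown in Figure 3.1.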


3.1.2 Continuous case 

Definition 3.1.5. A random variable that may take an uncountably infinite number of 
values is said to be of continuous type. 

Example 3.1.2 (continued). Because the set of possible values of the random variable 
T in Example 3.1.2 is the interval $(0, \infty)$, T is a continuous random variable. 

Remark. We assume in Example 3.1.2 that the person cannot arrive and use the ATM 
immediately, otherwise T would be a random variable of mixed type, that is, a variable 
that is discrete and continuous at the same time. We do not insist on this type of random 
variable in this textbook. 

Definition 3.1.6. The (probability) density function of a continuous random 
variable $X$ is a function $f_X$ defined for all $x \in \mathbb{R}$ and having the following properties: 

(i) $f_X(x) \ge 0$ for any real number $x$; 

(ii) if $A$ is any subset of $\mathbb{R}$, then 

$$P[X \in A] = \int_A f_X(x)\,dx.$$

Remarks. (i) Because $X$ is a real-valued function, so that it must assume some value in 
the interval $(-\infty, \infty)$, we can write that 

$$\int_{-\infty}^{\infty} f_X(x)\,dx = 1.$$

(ii) The density function is different from the probability function $p_X$ of a discrete 
random variable. Indeed, $f_X(x)$ does not give the probability that the random variable 
$X$ takes on the value $x$. Moreover, we may have that $f_X(x) > 1$. Actually, we may write 
that 

$$f_X(x)\,\epsilon \simeq P\left[x - \frac{\epsilon}{2} \le X \le x + \frac{\epsilon}{2}\right],$$

where $\epsilon > 0$ is small. Thus, $f_X(x)\epsilon$ is approximately equal to the probability that $X$ 
takes on a value in an interval of length $\epsilon$ about $x$. 


Definition 3.1.7. The distribution function $F_X$ of a continuous random variable $X$ 
is defined by 

$$F_X(x) = P[X \le x] = \int_{-\infty}^{x} f_X(u)\,du.$$

We deduce from this definition that 

$$P[X = x] = P[x \le X \le x] = P[X \le x] - P[X < x] = \int_{-\infty}^{x} f_X(u)\,du - \int_{-\infty}^{x^-} f_X(u)\,du = 0$$

for any real number $x$, where $x^-$ means that the range of the integral is the open interval 
$(-\infty, x)$. That is, before performing the random experiment, the probability of obtaining 
a particular value of a continuous random variable is equal to zero. Therefore, if we take 
a point at random in the interval $[0, 1]$, we may assert that the point that we will obtain 
did not have, a priori, any chance of being selected! 

We also deduce from the previous definition that 

$$\frac{d}{dx} F_X(x) = f_X(x) \qquad (3.1)$$

for any $x$ where $F_X(x)$ is differentiable. 

Remarks. (i) If $X$ is a continuous random variable, then its distribution function $F_X$ is 
also continuous. However, a continuous function is not necessarily differentiable at all 
points. Furthermore, the density function of $X$ may be discontinuous, as in the next 
example. Actually, $f_X$ is a piecewise continuous function, that is, a function having at 
most a finite number of jump discontinuities (see p. 2). We say that $f_X$ has a jump 
discontinuity at $x_0$ if both $\lim_{x \downarrow x_0} f_X(x)$ and $\lim_{x \uparrow x_0} f_X(x)$ exist, but are different. 

(ii) Every random variable $X$ has a distribution function $F_X$. To simplify the 
presentation, we could theoretically define the density function $f_X$ as the derivative of $F_X$, 
whether $X$ is a discrete, continuous, or mixed type random variable. We mentioned in 
the previous remark that when $F_X$ is a continuous function, its derivative is a piecewise 
continuous function. However, in the case of a discrete random variable, the distribution 
function $F_X$ is a step or staircase function, as in Figure 3.1. The derivative of a step 
function is equal to zero everywhere, except at the jump points $x_k$, $k = 1, 2, \ldots$. We 
can write that 

$$\frac{d}{dx} F_X(x) = \sum_{k=1}^{\infty} P[X = x_k]\,\delta(x - x_k),$$

where $P[X = x_k] = \lim_{x \downarrow x_k} F_X(x) - \lim_{x \uparrow x_k} F_X(x)$, and $\delta(\cdot)$ is the Dirac delta function 
(see p. 4). Similarly, the density function of a mixed type random variable involves 
the Dirac delta function. To avoid using this generalized function, the vast majority of 
authors prefer to consider the discrete and continuous random variables (and vectors) 
separately. In the discrete case, we define the probability mass function $p_X(x_k) = P[X = x_k]$, 
as we did above, rather than the density function. 

Example 3.1.3. Suppose that the waiting time (in minutes) to be served at a counter 
in a bank is a continuous random variable X having the density function (see Figure 3.2) 



Fig. 3.2. Density function of the random variable in Example 3.1.3. 


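$$f_X(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1/2 & \text{if } 0 \le x \le 1, \\ 3/(2x^4) & \text{if } x > 1. \end{cases}$$

Note that the function $f_X$ is a valid density function, because $f_X(x) \ge 0$ for all $x$ and 

$$\int_{-\infty}^{\infty} f_X(x)\,dx = \int_0^1 \frac{1}{2}\,dx + \int_1^{\infty} \frac{3}{2x^4}\,dx = \frac{1}{2} - \frac{1}{2x^3}\Big|_1^{\infty} = 1.$$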

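Calculate (a) the distribution function of $X$ and (b) the conditional probability 
$P[X > 2 \mid X > 1]$. 

Solution. (a) By definition, 

$$F_X(x) = \begin{cases} \displaystyle\int_{-\infty}^{x} 0\,du = 0 & \text{if } x < 0, \\[1ex] \displaystyle\int_{-\infty}^{0} 0\,du + \int_0^x \frac{1}{2}\,du = \frac{x}{2} & \text{if } 0 \le x \le 1, \\[1ex] \displaystyle\int_{-\infty}^{0} 0\,du + \int_0^1 \frac{1}{2}\,du + \int_1^x \frac{3}{2u^4}\,du = 1 - \frac{1}{2x^3} & \text{if } x > 1 \end{cases}$$

(see Figure 3.3). 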


Fig. 3.3. Distribution function of the random variable in Example 3.1.3. 


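(b) We seek 

$$P[X > 2 \mid X > 1] = \frac{P[\{X > 2\} \cap \{X > 1\}]}{P[X > 1]} = \frac{P[X > 2]}{P[X > 1]} = \frac{1 - P[X \le 2]}{1 - P[X \le 1]} = \frac{1 - F_X(2)}{1 - F_X(1)} = \frac{1 - \frac{15}{16}}{1 - \frac{1}{2}} = \frac{1}{8}.$$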
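
The computations of Example 3.1.3 can be checked numerically. The sketch below is only a sanity check under stated assumptions: the helper names (`density`, `cdf`, `midpoint_integral`) are illustrative, the improper integral is truncated at x = 200, and a plain midpoint rule stands in for any more refined integration method.

```python
from fractions import Fraction


def density(x):
    """f_X of Example 3.1.3."""
    if x < 0:
        return 0.0
    if x <= 1:
        return 0.5
    return 3.0 / (2.0 * x ** 4)


def cdf(x):
    """Closed-form F_X obtained in part (a), in exact arithmetic."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x <= 1:
        return x / 2
    return 1 - 1 / (2 * x ** 3)


def midpoint_integral(f, a, b, steps=100_000):
    """Plain midpoint rule; accurate enough for a sanity check."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Total mass: the neglected tail beyond x = 200 equals 1 / (2 * 200**3).
total = midpoint_integral(density, 0.0, 1.0) + midpoint_integral(density, 1.0, 200.0)
print(round(total, 4))

# The closed form agrees with numerical integration of the density at x = 2.
numeric_F2 = midpoint_integral(density, 0.0, 1.0) + midpoint_integral(density, 1.0, 2.0)
print(abs(numeric_F2 - float(cdf(2))) < 1e-6)

# Part (b): P[X > 2 | X > 1] = (1 - F_X(2)) / (1 - F_X(1)).
conditional = (1 - cdf(2)) / (1 - cdf(1))
print(conditional)
```

Using exact fractions for the closed form keeps the conditional probability free of rounding, while the floating-point integration is only used to confirm that the density and the distribution function are consistent with each other.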


Remark. Because $X$ is a continuous random variable, $P[X < x] = P[X \le x] = F_X(x)$ 
for any real number $x$. It follows that $P[X \ge 2 \mid X \ge 1] = 1/8$ as well. In general, we 
have: 

$$P[a < X < b] = P[a \le X < b] = P[a < X \le b] = P[a \le X \le b]$$

if $X$ is a continuous random variable. 

3.2 Important discrete random variables 

3.2.1 Binomial distribution 

Suppose that we perform repetitions of a certain random experiment and that we divide 
the set of possible outcomes into two mutually exclusive and exhaustive subsets: 
$B_1 \cap B_2 = \emptyset$ and $B_1 \cup B_2 = \Omega$. That is, $B_1$ and $B_2$ constitute a partition of the sample space 
$\Omega$ (see p. 35). If the elementary outcome that occurs belongs to $B_1$, then we say that 
the experiment resulted in a success; in the opposite case, it resulted in a failure. 

Definition 3.2.1. Let X be the (discrete) random variable that counts the number of 
successes obtained in n repetitions of a random experiment, where n is fixed. If 

(i) the probability p of success is constant for the n trials and 

(ii) the trials are independent, 

then we say that $X$ has (or follows) a binomial distribution with parameters $n$ and 
$p$. We write that $X \sim B(n, p)$. 

Remark. A parameter is a symbol that appears in the definition of a random variable 
and that can take different values. For example, in the case of the binomial distribution, 
n can take on the values 1,2,..., and p all the values in the interval (0,1). In practice, 
the parameter p is generally unknown. 

Now, suppose that, when we performed the $n$ trials, we first obtained $x$ consecutive 
successes $S$ and then $n - x$ consecutive failures $F$. By independence, we may write that 
the probability of this elementary outcome is 

$$P[\underbrace{SS \ldots S}_{x \text{ times}}\ \underbrace{FF \ldots F}_{(n-x) \text{ times}}] = \{P[S]\}^x \{P[F]\}^{n-x} = p^x (1-p)^{n-x}.$$

Hence, given that we can place the $x$ successes among the $n$ trials in $\binom{n}{x}$ different ways, 
we deduce that the probability function of the random variable $X \sim B(n, p)$ is given by 

$$p_X(x) = \binom{n}{x} p^x q^{n-x} \quad \text{for } x = 0, 1, \ldots, n,$$

where $q := 1 - p$. 

Remarks. (i) We have that $p_X(x) \ge 0$ for all $x$ and, by Newton's binomial formula, 

$$\sum_{x=0}^{n} p_X(x) = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} = (p + q)^n = 1,$$

as should be. 

(ii) The distribution function of $X$ is 

$$F_X(k) = \sum_{x=0}^{k} \binom{n}{x} p^x q^{n-x} \quad \text{for } k = 0, 1, \ldots, n.$$

There is no simple formula (without a summation symbol) for $F_X$. To evaluate this 
function, we can use a pocket calculator or a statistical software package. Some values 
of the distribution function $F_X$ are given in Table B.1, page 276. 

(iii) The shape and the position of the probability function $p_X$ depend on the parameters
n and p (see Figure 3.4).




Fig. 3.4. Probability functions of binomial random variables. 


(iv) If X has a binomial distribution with parameters n = 1 and p, we also say that X
follows a Bernoulli distribution with parameter p. We thus have:

$$p_X(x) = p^x (1-p)^{1-x} = \begin{cases} 1 - p & \text{if } x = 0, \\ p & \text{if } x = 1. \end{cases}$$


Moreover, we can show that if the random variables $X_1, \ldots, X_n$ are independent (see
Section 4.1) and if they all have a Bernoulli distribution with parameter p, then

$$\sum_{i=1}^{n} X_i \sim B(n,p).$$

Finally, we say that a binomial random variable counts the number of successes in n
Bernoulli trials, that is, in n independent trials for which the probability p of success is
the same from trial to trial.

Example 3.2.1. In an airport, five radars are in operation and each radar has a 0.9
probability of detecting an arriving airplane. The radars operate independently of each
other.

(a) Calculate the probability that an arriving airplane will be detected by at least four 
radars. 

(b) Knowing that at least three radars detected a given airplane, what is the probability 
that the five radars detected this airplane? 

(c) What is the smallest number n of radars that must be installed if we want an arriving 
airplane to be detected by at least one radar with probability 0.9999? 

Solution. Let X be the number of radars that detect the airplane. 

(a) We have that $X \sim B(n = 5, p = 0.9)$. We seek

$$P[X \ge 4] = \binom{5}{4} (0.9)^4 (0.1) + (0.9)^5 = (0.9)^4 [5 \times (0.1) + 0.9] \simeq 0.9185.$$

Remark. Let Y be the number of radars that do not detect the arriving airplane. We
find in Table B.1, page 276, that

$$P[X \ge 4] = P[Y \le 1] \simeq 0.9185.$$

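(b) We want

$$P[X = 5 \mid X \ge 3] = \frac{P[\{X = 5\} \cap \{X \ge 3\}]}{P[X \ge 3]} = \frac{P[X = 5]}{P[X \ge 3]} = \frac{P[Y = 0]}{P[Y \le 2]} \simeq \frac{0.5905}{0.9914} \simeq 0.596,$$

where Y is the random variable defined in the remark in (a) and the numerical values are
taken from Table B.1.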


(c) We now have that $X \sim B(n, p = 0.9)$ and we seek (the smallest) n such that
$P[X \ge 1] = 0.9999$. We have:

$$P[X \ge 1] = 1 - P[X = 0] = 1 - \binom{n}{0} (0.9)^0 (0.1)^{n-0} = 1 - (0.1)^n = 0.9999 \iff (0.1)^n = 0.0001 = (0.1)^4.$$

Thus, we may write that $n_{\min} = 4$.

Remark. Note that, n being a positive integer, we cannot, in general, find a value of
n for which the probability requested is exactly equal to a given number p. We must
rather find the smallest n for which the probability of the event in question is greater
than or equal to p. For instance, here if we had required the probability of detecting the
airplane to be (at least) 0.9995, then the answer would have been the same: $n_{\min} = 4$.
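
The three parts of Example 3.2.1 can be checked numerically. The following is a minimal sketch in Python (standard library only); the helper names binom_pmf, binom_sf, and min_radars are ours and do not come from the text. Exact fractions are used in part (c) so that the comparison with 0.9999 is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """P[X = x] for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_sf(k: int, n: int, p: float) -> float:
    """P[X >= k] for X ~ B(n, p)."""
    return sum(binom_pmf(x, n, p) for x in range(k, n + 1))


def min_radars(p_detect: Fraction, target: Fraction) -> int:
    """Smallest n such that P[at least one detection] >= target."""
    n = 1
    while 1 - (1 - p_detect) ** n < target:
        n += 1
    return n


# (a) at least four of the five radars detect the airplane
p_at_least_4 = binom_sf(4, 5, 0.9)
# (b) all five detect it, given that at least three did
p_all_given_3 = binom_pmf(5, 5, 0.9) / binom_sf(3, 5, 0.9)
# (c) smallest number of radars for a detection probability of 0.9999
n_min = min_radars(Fraction(9, 10), Fraction(9999, 10000))

print(p_at_least_4, p_all_given_3, n_min)
```

Calling min_radars with the target 0.9995 instead returns the same n, in agreement with the remark above.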

3.2.2 Geometric and negative binomial distributions

Definition 3.2.2. Let X be the random variable that counts the number of Bernoulli 
trials needed to obtain a first success. We say that X has a geometric distribution 
with parameter p, where p is the probability of success on any trial. We write that X ~ 
Geo(p) [or Geom(p)]. 


We have:

$$p_X(x) = P[\underbrace{FF \cdots F}_{(x-1) \text{ times}} S] = \{P[F]\}^{x-1} P[S] = q^{x-1} p$$

for x = 1, 2, ... . We observe that the function $p_X$ is strictly decreasing (see Figure 3.5).


Fig. 3.5. Probability function of a geometric random variable (p = 0.25).


Remarks. (i) We have that $p_X(x) > 0$ for all x and

$$\sum_{x=1}^{\infty} p_X(x) = \sum_{x=1}^{\infty} q^{x-1} p = \frac{p}{1-q} = 1,$$

so that the function $p_X$ is a valid probability function.

(ii) The distribution function of X is given by

$$F_X(x) = \sum_{k=1}^{x} q^{k-1} p = 1 - q^x$$

for x = 1, 2, ... . Note that we then deduce that $P[X > x] = q^x$.

(iii) Making use of the formula $P[X > x] = q^x$, we can show that

$$P[X > x + y \mid X > x] = P[X > y] \quad \text{for any } x, y \in \{0, 1, 2, \ldots\}.$$

This property is known as the memoryless property of the geometric distribution.

Remark. The geometric distribution is sometimes defined as the number of Bernoulli 
trials performed before the first success occurs. 
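
The memoryless property can be illustrated numerically. The sketch below (Python, standard library only; the function names and the values p = 0.25, x = 3, y = 4 are chosen by us for illustration) computes P[X > x] by summing the probability function of X ~ Geo(p), where X counts the trials up to the first success, and compares the conditional probability with P[X > y].

```python
def geom_pmf(x: int, p: float) -> float:
    """P[X = x] for X ~ Geo(p), x = 1, 2, ... (trials up to the first success)."""
    return (1 - p) ** (x - 1) * p


def geom_sf(x: int, p: float) -> float:
    """P[X > x], obtained by summing the probability function up to x."""
    return 1 - sum(geom_pmf(k, p) for k in range(1, x + 1))


p = 0.25
x, y = 3, 4

lhs = geom_sf(x + y, p) / geom_sf(x, p)  # P[X > x + y | X > x]
rhs = geom_sf(y, p)                      # P[X > y]

# Both values should agree with q**y, where q = 1 - p
print(lhs, rhs, (1 - p) ** y)
```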

Definition 3.2.3. Let X be the random variable that counts the number of Bernoulli 
trials performed until the rth success occurs, where r = 1,2,... . We say that X has 
a negative binomial distribution with parameters r and p. We write that X ~ 
NB(r , p). 

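Note that the geometric distribution is the particular case of the negative binomial
distribution obtained with r = 1. We get the probability function of X as follows:

$$\begin{aligned} p_X(x) &= P[\underbrace{F \cdots F}_{(x-r) \text{ times}} \underbrace{S \cdots S}_{r \text{ times}}] + P[\underbrace{F \cdots F}_{(x-r-1) \text{ times}} \underbrace{S \cdots S}_{(r-1) \text{ times}} F S] + \cdots + P[\underbrace{S \cdots S}_{(r-1) \text{ times}} \underbrace{F \cdots F}_{(x-r) \text{ times}} S] \\ &= \binom{x-1}{r-1} p^r (1-p)^{x-r} \quad \text{for } x = r, r+1, \ldots, \qquad (3.2) \end{aligned}$$

by independence and incompatibility, because there are $\binom{x-1}{r-1}$ different ways of placing
the $r - 1$ successes among the first $x - 1$ trials (the xth trial being necessarily a success).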

Remarks. (i) The negative binomial distribution is also known as the Pascal distribution.

(ii) As in the case of the binomial distribution, the shape and the position of the function
$p_X$ vary according to the values taken by the parameters r and p.

(iii) Notice the difference between the binomial and the negative binomial distributions:
a binomial random variable counts the number of successes in a fixed number (n) of
trials, whereas in the case of the negative binomial distribution, the random variable
denotes the number of trials required to obtain a fixed number (r) of successes. If
$X \sim NB(r,p)$, then we can write that

$$P[X = x] = P[B(x - 1, p) = r - 1] \, p.$$

Moreover, we can show the following relation between the two distributions:

$$P[NB(r,p) \le x] = P[B(x,p) \ge r] \quad \text{for } x = r, r+1, \ldots .$$


Example 3.2.2. A man shoots at a target until he has hit it twice. Suppose that the 
probability that a given shot hits the target is equal to 0.8. What is the probability that 
the man must shoot exactly four times? 




Solution. Let X be the number of shots needed to end the random experiment. Then, if
we assume that the shots are independent, we may write that $X \sim NB(r = 2, p = 0.8)$.
We seek

$$P[X = 4] = p_X(4) = \binom{3}{1} (0.8)^2 (1 - 0.8)^2 = 3 \times (0.64)(0.04) = 0.0768.$$

Remark. If the man stops shooting as soon as he hits the target, then $X \sim Geo(p = 0.8)$ and

$$P[X = 4] = (1 - 0.8)^3 (0.8) = 0.0064.$$
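
As a check on Example 3.2.2, the sketch below (Python, standard library only; binom_pmf and negbin_pmf are names we introduce) evaluates Formula (3.2), recovers the geometric case with r = 1, and verifies the identity P[X = x] = P[B(x - 1, p) = r - 1] p for the values of the example.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P[B(n, p) = k]."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def negbin_pmf(x: int, r: int, p: float) -> float:
    """P[X = x] for X ~ NB(r, p), where X counts the trials up to the rth success."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


p_two_hits = negbin_pmf(4, r=2, p=0.8)     # Example 3.2.2: second hit on the 4th shot
p_first_hit = negbin_pmf(4, r=1, p=0.8)    # geometric case: first hit on the 4th shot
via_binomial = binom_pmf(1, 3, 0.8) * 0.8  # P[B(x - 1, p) = r - 1] * p with x = 4, r = 2

print(p_two_hits, p_first_hit, via_binomial)
```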


3.2.3 Hypergeometric distribution 


Suppose that we perform n repetitions of a random experiment, but that the probability
of success varies from one repetition to another. For example, we take without
replacement n objects in a lot of finite size N, and we count the number of defective
objects obtained. In this case, we cannot use the binomial distribution. Suppose that d
objects, among the N, are defective (or possess a certain characteristic). Let X be the
number of defective objects obtained among the n drawn. We have:


$$p_X(x) = \frac{\binom{d}{x} \binom{N-d}{n-x}}{\binom{N}{n}} \quad \text{for } x = 0, 1, \ldots, n. \qquad (3.3)$$

Indeed, there are $\binom{N}{n}$ different (equiprobable) samples of size n and, among them, there
are $\binom{d}{x} \cdot \binom{N-d}{n-x}$ with exactly x defective and $n - x$ nondefective objects.

Remark. We have that $\binom{n}{k} = 0$ if k < 0 or k > n.


Definition 3.2.4. We say that the random variable X whose probability function is
given by Formula (3.3) follows a hypergeometric distribution with parameters N,
n, and d. We write that $X \sim \mathrm{Hyp}(N, n, d)$.


We must have that $N \in \{1, 2, \ldots\}$, $n \in \{(0), 1, 2, \ldots, N\}$, and $d \in \{(0), 1, \ldots, N\}$.
Moreover, if the size N of the lot is large in comparison to the size n of the sample, then
taking the objects without replacement will have little influence on the probability
of getting a defective object from draw to draw. That is, it is almost as if the objects
were taken with replacement. Now, in that case, we may write that $X \sim B(n, p = d/N)$.
Hence, we deduce that the binomial distribution can be used to approximate
the hypergeometric distribution. To be more precise, we can show that if d and N
tend to infinity in such a way that d/N converges to p, but n is kept constant, then the
distribution of $X \sim \mathrm{Hyp}(N, n, d)$ tends to that of a B(n, p) random variable. In practice,
the approximation obtained should be good when n/N < 0.1.

Remark. The quantity n/N is called the sampling fraction.

Example 3.2.3. Lots containing 25 devices each are subjected to the following sampling 
plan: a sample of 5 devices is taken at random and without replacement, and the lot 
is accepted if and only if the sample contains less than 3 defective devices. Calculate, 
supposing that there are exactly 4 defective devices in a particular lot, 

(a) the probability that this lot is accepted; 

(b) an approximation of the probability calculated in (a) with the help of a binomial 
distribution. 

Solution. Let X be the number of defective devices in the sample. We have that
$X \sim \mathrm{Hyp}(N = 25, n = 5, d = 4)$.

(a) Let F be the event “the lot is accepted.” We seek

$$P[F] = P[X \le 2] = \sum_{x=0}^{2} \frac{\binom{4}{x} \binom{21}{5-x}}{\binom{25}{5}} \simeq 0.3830 + 0.4506 + 0.1502 \simeq 0.984.$$

(b) We can write that

$$P[X \le 2] \simeq P[Y \le 2], \quad \text{where } Y \sim B(n = 5, p = 4/25),$$

and

$$P[Y \le 2] = \sum_{y=0}^{2} \binom{5}{y} (4/25)^y (21/25)^{5-y} \simeq 0.4182 + 0.3983 + 0.1517 \simeq 0.968.$$


Note that here we have that n/N = 5/25 = 0.2, which is greater than 0.1. Therefore, we
did not expect the approximation to be very good. If we replace N = 25 by N = 100,
so that n/N = 0.05, we obtain that

$$P[\mathrm{Hyp}(N = 100, n = 5, d = 4) \le 2] \simeq 0.9998$$

and

$$P[B(n = 5, p = 4/100) \le 2] \simeq 0.9994,$$

which is a better approximation. Finally, if we keep the same ratio d/N = 4/25 = 0.16,
but if we assume that N = 100 and d = 16, then we calculate

$$P[\mathrm{Hyp}(N = 100, n = 5, d = 16) \le 2] \simeq 0.9720,$$

and with N = 200 and d = 32, we find that

$$P[\mathrm{Hyp}(N = 200, n = 5, d = 32) \le 2] \simeq 0.9700.$$

Notice that the quality of the approximation $P[X \le 2] \simeq 0.968$ obtained with a binomial
distribution increases with increasing N, even though d/N is always equal to 0.16.
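
The effect of the sampling fraction on the quality of the binomial approximation can be reproduced with a few lines of code. The following sketch (Python, standard library only; hyp_cdf and binom_cdf are our own helper names) keeps d/N = 0.16 and n = 5 fixed and lets N grow, as in Example 3.2.3.

```python
from math import comb


def hyp_cdf(k: int, N: int, n: int, d: int) -> float:
    """P[X <= k] for X ~ Hyp(N, n, d), from Formula (3.3)."""
    return sum(comb(d, x) * comb(N - d, n - x) for x in range(k + 1)) / comb(N, n)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P[Y <= k] for Y ~ B(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


n = 5
for N, d in [(25, 4), (100, 16), (200, 32)]:
    exact = hyp_cdf(2, N, n, d)
    approx = binom_cdf(2, n, d / N)
    print(f"N = {N:3d}, n/N = {n / N:.3f}: Hyp = {exact:.4f}, "
          f"B = {approx:.4f}, error = {abs(exact - approx):.4f}")
```

The binomial value is the same on every line, because d/N does not change; only the exact hypergeometric probability moves toward it as N increases.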

3.2.4 Poisson distribution and process

Let X be a random variable having a binomial distribution with parameters n and p. We
can show that if n tends to infinity and p decreases to 0, in such a way that the product
np remains equal to the constant $\lambda$, then the probability function of X converges to the
function $p_X(x)$ given by

$$p_X(x) = \frac{e^{-\lambda} \lambda^x}{x!} \quad \text{for } x = 0, 1, \ldots . \qquad (3.4)$$

Definition 3.2.5. We say that the discrete random variable X whose probability function
is given by Formula (3.4) has a Poisson distribution with parameter $\lambda > 0$. We
write that $X \sim \mathrm{Poi}(\lambda)$.


Remarks. (i) Making use of the formula

$$e^x = 1 + x + \frac{x^2}{2!} + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!},$$

we easily show that the function defined in (3.4) is a valid probability function.

(ii) The Greek letter $\alpha$ is also often used for the parameter of the Poisson distribution.
In statistics, we write $\theta$ to designate an arbitrary parameter of a random variable.

(iii) The shape of the probability function $p_X$ depends on the value of the parameter $\lambda$.

(iv) To evaluate the distribution function of a Poisson random variable, we can use
a pocket calculator or a statistical software package. Table B.2, page 278, gives many
values of this function.

(v) We deduce from what precedes that we can use a Poisson distribution with parameter
$\lambda = np$ to approximate the binomial distribution with parameters n and p. In general,
the Poisson approximation should be good if n > 20 and p < 0.05. If the value of p
is greater than 1/2, then we must consider the number of failures (with probability
$1 - p < 1/2$) rather than the number of successes before performing the approximation
by the Poisson distribution.


Example 3.2.4. A new type of brakes is being studied. The company manufacturing 
these brakes claims that they could last at least 100,000 km for 90% of the vehicles that 
will use them. A laboratory simulated the driving of 100 cars using these brakes. Let X 
be the number of cars whose brakes will not last 100,000 km. 

(a) What distribution does X follow? 

(b) We will doubt the claimed percentage if the brakes must be changed on 17 cars or
more before 100,000 km. What is, approximately, the probability of observing this event
if, in fact, the claimed percentage is exact?

Solution. (a) By definition, if we assume that the cars are independent, we may write
that $X \sim B(n = 100, p = 0.1)$.

(b) We want $P[X \ge 17]$. We have that $P[X \ge 17] \simeq P[Y \ge 17]$, where
$Y \sim \mathrm{Poi}(\lambda = 100(0.1) = 10)$. We then find in Table B.2, page 278, that

$$P[Y \ge 17] = 1 - P[Y \le 16] \simeq 1 - 0.9730 = 0.0270.$$

Remark. Here, we have that n = 100 and p = 0.1. The value of n is very large, which is
preferable, but that of p is slightly too large to expect a good approximation. Actually,
we find that $P[X \ge 17] \simeq 0.021$.
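
The comparison made in the remark can be carried out directly, without tables. A minimal sketch (Python, standard library only; the function names are ours) computes the exact binomial tail and its Poisson approximation for Example 3.2.4.

```python
from math import comb, exp, factorial


def binom_sf(k: int, n: int, p: float) -> float:
    """P[X >= k] for X ~ B(n, p)."""
    return 1 - sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k))


def poisson_sf(k: int, lam: float) -> float:
    """P[Y >= k] for Y ~ Poi(lam)."""
    return 1 - sum(exp(-lam) * lam**y / factorial(y) for y in range(k))


exact = binom_sf(17, 100, 0.1)       # X ~ B(100, 0.1)
approx = poisson_sf(17, 100 * 0.1)   # Y ~ Poi(10)

print(f"exact = {exact:.4f}, Poisson approximation = {approx:.4f}")
```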

Poisson process

Suppose that the random variable N(t) denotes the number of events that will occur
in the interval [0, t]. For instance, we can be interested in the number of failures of a
machine, or in the number of customers, or in the number of telephone calls in the
interval [0, t]. If we make the following assumptions:

(i) N(0) = 0;

(ii) the value of $N(t_4) - N(t_3)$ is independent of the value taken by $N(t_2) - N(t_1)$ if
$0 \le t_1 < t_2 \le t_3 < t_4$;

(iii) $N(t + s) - N(t) \sim \mathrm{Poi}(\lambda s)$, for $s, t \ge 0$,

then the set $\{N(t), t \ge 0\}$ of random variables is called a Poisson process with rate
$\lambda > 0$.

Remarks. (i) Condition (ii) above means that what happens in two disjoint intervals is
independent. Furthermore, condition (iii) implies that the distribution of the number of
events in an arbitrary interval depends only on the length of this interval. We say that
the Poisson process has independent and stationary increments, respectively.

(ii) Conditions (i) and (iii) imply that $N(t) = N(0 + t) - N(0) \sim \mathrm{Poi}(\lambda t)$.

(iii) The Poisson process is a very important particular case of what is known as a
stochastic or random process. A stochastic process is a set $\{X(t), t \in T\}$ of random
variables X(t) (see Chapter 6). The set T is a subset of $\mathbb{R}$. In the case of the Poisson
process, we take $T = [0, \infty)$. For every particular value $t_0$ of t, we get a random variable
$N(t_0)$ having a Poisson distribution with parameter $\lambda t_0$. It is important to distinguish
between the random variable N(t) and the random process $\{N(t), t \ge 0\}$.

(iv) The Poisson process is a particular continuous-time Markov chain, and is used
abundantly in communication theory and in queueing theory, which is the subject of
Chapter 6.

Example 3.2.5. Telephone calls arrive at an exchange according to a Poisson process
with rate $\lambda = 2$ per minute (i.e., calls arrive at the average rate of 2 per minute,
according to a Poisson distribution). Calculate the probability that exactly 2 calls will
be received during each of the first 5 minutes of a given hour. What is the probability
that exactly 10 calls will be received during the first 5 minutes of the hour in question?

Solution. Let N(1) be the number of calls received during a 1-minute period. We can
write that $N(1) \sim \mathrm{Poi}(2 \cdot 1)$. We first calculate

$$P[N(1) = 2] = P[\mathrm{Poi}(2) = 2] = e^{-2} \frac{2^2}{2!} = 2e^{-2}.$$

Next, let M be the number of minutes, among the 5 minutes considered, during
which exactly 2 calls will be received. By independence, we may write that M follows a
binomial distribution with parameters n = 5 and $p = 2e^{-2}$. We seek

$$P[M = 5] = \binom{5}{5} (2e^{-2})^5 (1 - 2e^{-2})^{5-5} = 32 e^{-10} \simeq 0.00145.$$

To obtain the probability that exactly 10 calls will be received during the first 5
minutes of the hour considered, we calculate

$$P[N(5) = 10] = P[\mathrm{Poi}(2 \cdot 5) = 10] = e^{-10} \frac{10^{10}}{10!} \simeq 0.1251.$$
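
The two answers of Example 3.2.5 differ by almost two orders of magnitude, which a short computation makes concrete. In the sketch below (Python, standard library only; poisson_pmf and the variable names are ours), the first probability uses the independence of the increments over the five disjoint 1-minute intervals, and the second uses the fact that N(5) has a Poisson distribution with parameter 2 x 5.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """P[N = k] for N ~ Poi(mean)."""
    return exp(-mean) * mean**k / factorial(k)


rate = 2.0  # lambda, in calls per minute

# Exactly 2 calls in each of 5 disjoint minutes (independent increments)
p_two_in_a_minute = poisson_pmf(2, rate * 1)
p_two_each_minute = p_two_in_a_minute**5

# Exactly 10 calls in total during the 5 minutes
p_ten_in_five = poisson_pmf(10, rate * 5)

print(p_two_each_minute, p_ten_in_five)
```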


3.3 Important continuous random variables 

3.3.1 Normal distribution 


Definition 3.3.1. Let X be a continuous random variable that can take any real value.
If its density function is given by

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2} \right\} \quad \text{for } -\infty < x < \infty,$$

then we say that X has a normal (or Gaussian) distribution with parameters $\mu$ and
$\sigma^2$, where $\mu \in \mathbb{R}$ and $\sigma > 0$. We write that $X \sim N(\mu, \sigma^2)$.


Remark. The parameter $\mu$ is actually equal to the mean of X, and $\sigma$ is the standard
deviation of X (see Section 3.5). Furthermore, the standard deviation of a random
variable is the square root of the variance of this variable. Therefore, in the case of the
normal distribution, $\sigma^2$ is its variance.


The normal distribution is the most important continuous distribution, largely because
of the central limit theorem, which is stated in Chapter 4. Moreover, all normal
distributions have the same general shape, namely that of a bell (see Figure 3.6).

Fig. 3.6. Density function of a normal random variable.


The functions $f_X(x)$ are symmetrical with respect to the parameter $\mu$. That is,
$f_X(\mu - x) = f_X(\mu + x)$ for all $x \in \mathbb{R}$.

The points $\mu - \sigma$ and $\mu + \sigma$ are those where the function $f_X$ changes its direction of
concavity. Finally, the larger $\sigma$ is, the more flattened the curve is. Conversely, if $\sigma$ is
small, then the curve is concentrated around the mean $\mu$.

Now, let $X \sim N(\mu, \sigma^2)$. We can show (see Section 3.4) that if we define the random
variable $Z = (X - \mu)/\sigma$, then $Z \sim N(0,1)$. The notation Z is used, in general, for the
N(0,1) distribution. Its density function is often denoted by $\phi(z)$.

Remark. The N(0,1) distribution is called the standard or unit normal distribution. 

The main values of the distribution function of the N(0,1) distribution, denoted by
$\Phi(z)$, are presented in Table B.3, page 279. With the help of this table, we can calculate
the probability $P[a \le X \le b]$ for any normal distribution. The table gives the value of
$\Phi(z)$ for positive z. By symmetry, we may write that $\Phi(-z) = 1 - \Phi(z)$.

If we look for the number a for which $P[X \le a] = p \ge 1/2$, we first find the number
z in Table B.3 that corresponds to this probability p (sometimes we must interpolate in
the table), and then we set

$$a = \mu + z \cdot \sigma.$$

If p < 1/2, the formula becomes (by symmetry)

$$a = \mu - z \cdot \sigma,$$

where z is now the number in Table B.3 that corresponds to the probability $1 - p$.


Finally, the numbers b that correspond to the main values of the probability p :=
P[Z > b], where $Z \sim N(0,1)$, for instance, p = 0.05, p = 0.01, and so on, are given in
Table B.4, page 280. Note that these numbers can be written as follows:

$$b = \Phi^{-1}(1 - p) = Q^{-1}(p),$$

where $Q(z) := 1 - \Phi(z)$ and $Q^{-1}$ is the inverse function of Q.


Example 3.3.1. Suppose that the compressive strength X (in pounds per square inch)
of a certain type of concrete has a normal distribution with parameters $\mu = 4200$ and
$\sigma^2 = (400)^2$.

(a) Calculate the probability $P[3000 \le X \le 4500]$.

(b) Solve the following equations: (i) $P[X \le a] = 0.95$; (ii) $P[X > b] = 0.9$.

Remark. In fact, the model proposed for the compressive strength of the concrete cannot
be the exact model, because a normal distribution can take on any real value, whereas
the compressive strength cannot be negative. Nevertheless, the model in question may
be a good approximation to the true (unknown) model. Furthermore, we find that the
probability that a random variable having a $N(\mu, \sigma^2)$ distribution will take on a value
in the interval $[\mu - 3\sigma, \mu + 3\sigma]$ is greater than 99.7%.


Solution. (a) We have:

$$\begin{aligned} P[3000 \le X \le 4500] &= P\left[ \frac{3000 - 4200}{400} \le \frac{X - 4200}{400} \le \frac{4500 - 4200}{400} \right] \\ &= P[-3 \le Z \le 0.75] = \Phi(0.75) - \Phi(-3) \\ &\simeq 0.7734 - 0.0013 = 0.7721 \quad \text{(Table B.3)}. \end{aligned}$$

(b) (i) We find in Table B.3, page 279, that $P[Z \le 1.645] \simeq 0.95$ (see also Table B.4,
page 280). Hence, we may write that

$$a \simeq 4200 + (1.645)(400) = 4858.$$

(ii) We have that $P[X > b] = 0.9 \iff P[X \le b] = 0.1$. Next, we may write that

$$P[X \le b] = 0.1 \iff P\left[ Z \le \frac{b - 4200}{400} \right] = 0.1 \iff \Phi\left( \frac{b - 4200}{400} \right) = 0.1 \iff Q\left( \frac{b - 4200}{400} \right) = 0.9.$$

Finally, we find in Table B.4 that the value that corresponds to $Q^{-1}(0.1)$ is approximately
equal to 1.282. Because, by symmetry, $Q^{-1}(0.9) = -Q^{-1}(0.1)$, it follows that

$$b \simeq 4200 + (-1.282)(400) = 3687.2.$$
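
The table lookups of Example 3.3.1 can be replaced by a direct computation. The sketch below relies on the NormalDist class of the Python standard library (module statistics, Python 3.8 or later), whose cdf and inv_cdf methods play the roles of the distribution function and of its inverse; the variable names are ours. Because the tables round z to a few decimals, the values obtained may differ slightly from those above.

```python
from statistics import NormalDist

# X ~ N(4200, 400**2); NormalDist takes the standard deviation, not the variance
strength = NormalDist(mu=4200, sigma=400)

# (a) P[3000 <= X <= 4500]
prob_a = strength.cdf(4500) - strength.cdf(3000)

# (b)(i) the number a such that P[X <= a] = 0.95
a = strength.inv_cdf(0.95)

# (b)(ii) the number b such that P[X > b] = 0.9, that is, P[X <= b] = 0.1
b = strength.inv_cdf(0.10)

print(f"P = {prob_a:.4f}, a = {a:.1f}, b = {b:.1f}")
```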


Making use of the central limit theorem (see Chapter 4), we can show that, if n is
large enough, we can use a normal distribution to approximate the binomial distribution
with parameters n and p. Let $X \sim B(n,p)$. The de Moivre-Laplace approximation is
the following:

$$P[X = x] \simeq f_Y(x) \quad \text{for } x = 0, 1, \ldots, n,$$

where $Y \sim N(np, npq)$.

Remarks. (i) Thus, we replace a probability at a point by the value of a density function
evaluated at this point. Recall that $f_Y(x)$ is not the probability that Y takes on the
value x. Indeed, because Y is a continuous random variable, we know that P[Y = x] = 0
for all $x \in \mathbb{R}$.

(ii) The mean of the binomial distribution is given by np, and its variance by $np(1 - p)$
(see Section 3.5). It is therefore logical to choose $\mu_Y = np$ and $\sigma_Y^2 = npq$.

When we want to evaluate (approximately) a probability of the form $P[a \le X \le b]$
with the help of a normal distribution, we use the following formula:

$$P[a \le X \le b] \simeq P\left[ Z \le \frac{b + 0.5 - np}{\sqrt{npq}} \right] - P\left[ Z \le \frac{a - 0.5 - np}{\sqrt{npq}} \right], \qquad (3.5)$$

where a and b are integers.

Remarks. (i) To approximate the probability P[a < X < b], we must write that

$$P[a < X < b] = P[a + 1 \le X \le b - 1] \simeq P\left[ Z \le \frac{b - 0.5 - np}{\sqrt{npq}} \right] - P\left[ Z \le \frac{a + 0.5 - np}{\sqrt{npq}} \right]. \qquad (3.6)$$

(ii) The term 0.5 in Formula (3.5) [and (3.6)] is a continuity correction factor that most
authors recommend using, because we replace a discrete distribution by a continuous
distribution.

(iii) The approximation obtained should be good if $np \ge 5$ when $p \le 0.5$, or if $n(1 - p) \ge 5$
when p > 0.5. The normal distribution being symmetrical, it is easier to approximate
the distribution of $X \sim B(n,p)$ with $p \simeq 1/2$ than to approximate the distribution of
a binomial random variable having a very small or very large parameter p. Actually,
in that case, we should use the Poisson approximation that we saw in the preceding
section.


Example 3.3.2. A manufacturing process produces 10% defective items. A sample of 
200 items is drawn at random. Let X be the number of defective items in the sample. 
Use a normal distribution to calculate (approximately) the probability P[X = 20]. 

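Solution. We may assume (see the first remark below) that the random variable X has
a B(n = 200, p = 0.1) distribution. So, we have that np = 20 > 5. We set

$$P[X = 20] \simeq f_Y(20), \quad \text{where } Y \sim N(20, 18),$$

so that

$$P[X = 20] \simeq \frac{1}{\sqrt{2\pi}\sqrt{18}} \exp\left\{ -\frac{(20 - 20)^2}{2 \cdot 18} \right\} \simeq 0.0940.$$

We can also proceed as follows:

$$\begin{aligned} P[X = 20] = P[20 \le X \le 20] &\simeq P\left[ Z \le \frac{20 + 0.5 - 20}{\sqrt{18}} \right] - P\left[ Z \le \frac{20 - 0.5 - 20}{\sqrt{18}} \right] \\ &\simeq \Phi(0.12) - \Phi(-0.12) \simeq 2(0.5478) - 1 = 0.0956, \end{aligned}$$

where the value of $\Phi(0.12)$ is taken from Table B.3.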


Remarks. (i) In the statement of the problem, it is not specified whether the sample
is taken with or without replacement. At any rate, in order to use the hypergeometric
distribution, we need the value of the size N of the lot, which is not specified either.
Therefore, we must assume that all the items in the sample have a 0.1 probability of
being defective, independently from one item to the other.

(ii) The exact value, obtained by using the binomial distribution, is

$$P[X = 20] = \binom{200}{20} (0.1)^{20} (0.9)^{180} \simeq 0.0936.$$

(iii) The Poisson approximation should work well in this example, because n is very
large and p is relatively small. We find that

$$P[X = 20] \simeq P[\mathrm{Poi}(\lambda = 20) = 20] = e^{-20} \frac{20^{20}}{20!} \simeq 0.0888.$$

Thus, the normal approximation is actually better in this particular example.
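
The four values compared in Example 3.3.2 can be computed side by side. The following sketch (Python, standard library only; the variable names are ours) evaluates the exact binomial probability, the de Moivre-Laplace density approximation, Formula (3.5) with a = b = 20, and the Poisson approximation. Note that the continuity-corrected value is computed here with the unrounded argument 0.5/sqrt(18), whereas the table-based calculation above rounds it to 0.12, so the two results differ slightly.

```python
from math import comb, exp, factorial, pi, sqrt
from statistics import NormalDist

n, p, x = 200, 0.1, 20
mean, var = n * p, n * p * (1 - p)  # np = 20 and npq = 18

# Exact binomial probability P[X = 20]
exact = comb(n, x) * p**x * (1 - p) ** (n - x)

# de Moivre-Laplace: density of N(np, npq) evaluated at x
density_approx = exp(-((x - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)

# Formula (3.5) with a = b = x, including the 0.5 continuity correction
std_normal = NormalDist()
z_hi = (x + 0.5 - mean) / sqrt(var)
z_lo = (x - 0.5 - mean) / sqrt(var)
corrected_approx = std_normal.cdf(z_hi) - std_normal.cdf(z_lo)

# Poisson approximation with lambda = np
poisson_approx = exp(-mean) * mean**x / factorial(x)

for name, value in [("exact", exact), ("density", density_approx),
                    ("corrected", corrected_approx), ("Poisson", poisson_approx)]:
    print(f"{name:10s} {value:.4f}")
```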


3.3.2 Gamma distribution 

Definition 3.3.2. The gamma function, denoted by $\Gamma$, is defined by

$$\Gamma(u) = \int_0^{\infty} x^{u-1} e^{-x} \, dx \quad \text{for } u > 0. \qquad (3.7)$$

We can show that $\Gamma(u) = (u - 1)\Gamma(u - 1)$ if u > 1. Because we find directly that
$\Gamma(1) = 1$, we may write that

$$\Gamma(u) = (u - 1)! \quad \text{if } u \in \{1, 2, \ldots\}.$$

Moreover, we have that $\Gamma(1/2) = \sqrt{\pi}$.

Definition 3.3.3. Let X be a continuous random variable whose density function is
given by

$$f_X(x) = \frac{(\lambda x)^{\alpha - 1} \lambda e^{-\lambda x}}{\Gamma(\alpha)} \quad \text{for } x > 0.$$

We say that X has a gamma distribution with parameters $\alpha > 0$ and $\lambda > 0$. We
write that $X \sim G(\alpha, \lambda)$.

Remark. The parameter $\lambda$ is a scale parameter, whereas $\alpha$ is a shape parameter. The
shape of the density function $f_X$ changes a lot with $\alpha$, when $\alpha$ is relatively small (see
Figure 3.7). When $\alpha$ becomes large, $f_X(x)$ tends to a normal density, which is a consequence
of the central limit theorem because, when $\alpha$ is an integer, the random variable
X can be represented as a sum of $\alpha$ random variables (see the next remark). When
$0 < \alpha < 1$, the function $f_X(x)$ tends to infinity as x decreases to 0.



Fig. 3.7. Density functions of various random variables having a gamma distribution with
$\lambda = 1$.


In general, we cannot give a simple formula, that is, without an integration sign
in it, for the distribution function of a random variable having a gamma distribution.
However, if the parameter $\alpha$ is a (positive) integer n, we can show that

$$P[G(n, \lambda) \le x] = P[\mathrm{Poi}(\lambda x) \ge n] = 1 - P[\mathrm{Poi}(\lambda x) \le n - 1].$$

Note that this formula enables us to express the distribution function of a continuous
random variable $X \sim G(n, \lambda)$ in terms of that of a discrete random variable $Y \sim \mathrm{Poi}(\lambda x)$:

$$F_X(x) = 1 - F_Y(n - 1).$$
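
This relation between the gamma and Poisson distributions can be checked numerically. The sketch below (Python, standard library only) assumes the usual gamma density with integer shape n and parameter $\lambda$, namely $\lambda (\lambda x)^{n-1} e^{-\lambda x} / (n-1)!$, integrates it on [0, x] with the composite Simpson rule, and compares the result with the Poisson expression. The function names and the values n = 3, $\lambda$ = 2, x = 1.5 are ours, chosen only for illustration.

```python
from math import exp, factorial


def erlang_pdf(x: float, n: int, lam: float) -> float:
    """Gamma density with integer shape n and parameter lam (assumed parametrization)."""
    return lam * (lam * x) ** (n - 1) * exp(-lam * x) / factorial(n - 1)


def erlang_cdf_numeric(x: float, n: int, lam: float, steps: int = 2000) -> float:
    """P[G(n, lam) <= x] by composite Simpson integration on [0, x] (steps must be even)."""
    h = x / steps
    total = erlang_pdf(0.0, n, lam) + erlang_pdf(x, n, lam)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * erlang_pdf(i * h, n, lam)
    return total * h / 3


def poisson_cdf(k: int, mean: float) -> float:
    """P[Y <= k] for Y ~ Poi(mean)."""
    return sum(exp(-mean) * mean**j / factorial(j) for j in range(k + 1))


n, lam, x = 3, 2.0, 1.5
lhs = erlang_cdf_numeric(x, n, lam)     # F_X(x), by numerical integration
rhs = 1 - poisson_cdf(n - 1, lam * x)   # 1 - F_Y(n - 1), with Y ~ Poi(lam * x)

print(lhs, rhs)
```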


Particular cases 

(i) If $\alpha$ is a natural number, then we also say that X follows an Erlang distribution,
which is important in queueing theory.

(ii) If $\alpha = n/2$, where $n \in \{1, 2, \ldots\}$, and $\lambda = 1/2$, then the gamma distribution is also
known as the chi-square distribution with n degrees of freedom. We write that $X \sim \chi^2_n$.
This distribution is very useful in statistics.

(iii) If $\alpha = 1$, then the density function $f_X$ becomes

$$f_X(x) = \lambda e^{-\lambda x} \quad \text{for } x > 0.$$

We say that X has an exponential distribution with parameter $\lambda > 0$. We write that
$X \sim \mathrm{Exp}(\lambda)$.

Remark. If $X_1, \ldots, X_n$ are independent random variables and if $X_i \sim \mathrm{Exp}(\lambda)$, for
i = 1, ..., n, then $Y := \sum_{i=1}^{n} X_i$ follows a gamma distribution with parameters n and $\lambda$
(see Chapter 4).

Suppose now that $\{N(t), t \ge 0\}$ is a Poisson process with rate $\lambda$. Let T be the arrival
time of the first event of the process. We may write that

$$P[T > t] = P[N(t) = 0] = P[\mathrm{Poi}(\lambda t) = 0] = e^{-\lambda t}.$$

It follows that

$$f_T(t) = -\frac{d}{dt} P[T > t] = \lambda e^{-\lambda t} \quad \text{for } t > 0.$$

Thus, we may assert that the random variable T has an exponential distribution with
parameter $\lambda$. Using the above remark and the properties of the Poisson process, we may
also assert, more generally, that the time needed to obtain n events (from any time
instant) has a $G(n, \lambda)$ distribution. This result enables us to justify Formula (3.3.2).

Remark. We can show that the exponential distribution, just as the geometric distribution,
possesses the memoryless property. That is, if $X \sim \mathrm{Exp}(\lambda)$, then

$$P[X > t + s \mid X > t] = P[X > s] \quad \text{for } s, t \ge 0.$$

Actually, only the geometric and exponential distributions possess this memoryless property.
Furthermore, in the case of the geometric distribution, the property is only valid
for s and $t \in \{0, 1, 2, \ldots\}$.

Example 3.3.3. The lifetime (in years) of a radio has an exponential distribution with
parameter $\lambda = 1/10$. If we buy a five-year-old radio, what is the probability that it will
work for less than 10 additional years?

Solution. Let X be the total lifetime of the radio. We have that $X \sim \mathrm{Exp}(\lambda = 1/10)$.
We seek

$$P[X < 15 \mid X > 5] = 1 - P[X > 15 \mid X > 5] = 1 - P[X > 10] = P[X \le 10] = 1 - e^{-1} \simeq 0.632,$$

where the second equality follows from the memoryless property.
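
The role played by the memoryless property in this example can be made explicit by computing the conditional probability in two ways. In the sketch below (Python, standard library only; exp_sf and the variable names are ours), the first value comes straight from the definition of conditional probability and the second from the memoryless property.

```python
from math import exp


def exp_sf(t: float, lam: float) -> float:
    """P[X > t] for X ~ Exp(lam), t >= 0."""
    return exp(-lam * t)


lam = 1 / 10

# P[X < 15 | X > 5] = P[5 < X < 15] / P[X > 5], from the definition
direct = (exp_sf(5, lam) - exp_sf(15, lam)) / exp_sf(5, lam)

# Same probability through the memoryless property: P[X <= 10]
memoryless = 1 - exp_sf(10, lam)

print(direct, memoryless)
```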



Because of its memoryless property, the exponential distribution is widely used in
reliability. This property implies that the failure rate of a device is constant over time. The
exponential distribution appears in the theory of stochastic processes and in queueing
theory as well.

An extension of the exponential distribution to the entire real line is obtained by
defining

$$f_X(x) = \frac{\lambda}{2} e^{-\lambda |x|} \quad \text{for } -\infty < x < \infty,$$

where $\lambda$ is a positive constant. We say that the random variable X has a double exponential
distribution, or a Laplace distribution, with parameter $\lambda$.


3.3.3 Weibull distribution 


Definition 3.3.4. Let X be a continuous random variable whose density function is of
the form

$$f_X(x) = \lambda \beta x^{\beta - 1} \exp\{-\lambda x^{\beta}\} \quad \text{for } x > 0.$$

We say that X follows a Weibull distribution with parameters $\lambda > 0$ and $\beta > 0$. We
write that $X \sim W(\lambda, \beta)$.

The Weibull distribution generalizes the exponential distribution, which is obtained
by taking $\beta = 1$. It is important in reliability. Like the gamma distribution, it can be
used in numerous applications because of the various shapes taken by its density function
depending on the values given to the parameter $\beta$. It is also one of the distributions
known as extreme value distributions. These distributions are used to model phenomena
that occur very rarely, such as extremely cold or hot temperatures, exceptional floods
of rivers, and so on.


Example 3.3.4. Let T denote the temperature (in degrees Celsius) in a certain city
during the month of July. Suppose that

$$X := T - 30 \mid \{T > 30\} \sim W(0.8, 0.5).$$

That is, given that the temperature is above 30 degrees, the excess over 30 degrees has a
Weibull distribution with parameters $\lambda = 0.8$ and $\beta = 0.5$. Thus,

$$f_X(x) = 0.4 x^{-1/2} \exp\{-0.8 x^{1/2}\} \quad \text{for } x > 0$$

(see Figure 3.8). Notice that the function $f_X(x)$ diverges when x decreases to 0.

Using the results in Section 3.5, we find that the average temperature in this city
(in July), when it exceeds 30°C, is equal to 33.125°C. We obtain the same value if we
suppose that X is exponentially distributed with parameter $\lambda = 1/3.125 = 0.32$ instead.
However, as can be seen in Figure 3.9, the Weibull distribution W(0.8, 0.5) goes to zero
more slowly than the Exp(0.32) distribution does. Therefore, extreme temperatures
(above approximately 42°C in this particular example) are more likely.

Fig. 3.8. Probability density function of a W(0.8, 0.5) random variable.

We have:

$$P[T > 35 \mid T > 30] = P[X > 5] = \int_5^{\infty} 0.4 x^{-1/2} \exp\{-0.8 x^{1/2}\} \, dx \simeq 0.1672.$$
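
The numerical values of Example 3.3.4 can be recovered in closed form. The sketch below (Python, standard library only) uses two facts that are not stated in this section and should be read as assumptions on our part: integrating the Weibull density from x to infinity gives $\exp\{-\lambda x^{\beta}\}$, and the mean of a $W(\lambda, \beta)$ random variable is $\lambda^{-1/\beta} \Gamma(1 + 1/\beta)$. The function names are ours.

```python
from math import exp, gamma

lam, beta = 0.8, 0.5  # parameters of W(lam, beta)


def weibull_sf(x: float) -> float:
    """P[X > x] = exp(-lam * x**beta), the integral of the density from x to infinity."""
    return exp(-lam * x**beta)


def exponential_sf(x: float, rate: float = 0.32) -> float:
    """P[X > x] for the Exp(0.32) model, which has the same mean."""
    return exp(-rate * x)


# Mean of X (standard formula, assumed here rather than taken from the text)
weibull_mean = lam ** (-1 / beta) * gamma(1 + 1 / beta)

print("P[X > 5]         :", weibull_sf(5))
print("mean temperature :", 30 + weibull_mean)
print("P[X > 12]        :", weibull_sf(12), "(Weibull) versus", exponential_sf(12), "(exponential)")
```

The last line compares the two models beyond 42°C (x = 12), where the Weibull tail is the heavier one.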

Fig. 3.9. Probability density functions of a W(0.8, 0.5) (continuous line) and an Exp(0.32)
(broken line) random variable in the interval [5, 20].

3.3.4 Beta distribution

Definition 3.3.5. Let X be a continuous random variable whose density function is
given by

$$f_X(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1} \quad \text{for } 0 < x < 1,$$

where $\alpha > 0$ and $\beta > 0$. We say that X has a beta distribution with parameters $\alpha$
and $\beta$. We write that $X \sim \mathrm{Be}(\alpha, \beta)$.

If $X \sim \mathrm{Be}(\alpha, \beta)$ and $Y := a + (b - a)X$, where a < b, Y is said to have a generalized
beta distribution.
Particular case 

Let $X \sim \mathrm{Be}(\alpha, \beta)$. If $\alpha = \beta = 1$, then we have:

$$f_X(x) = 1 \quad \text{for } 0 < x < 1.$$

We say that the continuous random variable X follows a uniform distribution on the
interval (0,1). We write that $X \sim U(0,1)$. In general, we have that $X \sim U(a, b)$ if (see
Figure 3.10)

$$f_X(x) = \frac{1}{b - a} \quad \text{for } a < x < b.$$



Fig. 3.10. Probability density function of a uniform random variable on the interval (a, b). 


Remarks. (i) This density function is obtained, for example, when a point is taken at
random in the interval (a, b). Because the probability that the selected point is close
to x, where a < x < b, is the same for all x, the function $f_X(x)$ must be constant in
the interval (a, b). Note that a random variable having a uniform distribution on the
interval (a, b) also has a generalized beta distribution with parameters $\alpha = \beta = 1$.

(ii) We can show that if exactly one event of a Poisson process took place in the interval 
(0, t], then the time instant T\ at which this event occurred has a uniform distribution 
on (0, t]. That is, 


Ti | {N(t) = 1} ~ U( 0 , t\. 
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Example 3.3.5. A point is taken at random on a line segment of length L. What is the 
probability that the length of the shorter segment divided by the length of the longer 
one is smaller than 1/4? 


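Solution. Suppose, without loss of generality, that the segment starts at 0. Let X be
the point selected. Then, $X \sim \mathrm{U}[0, L]$.

(i) If $X \in [0, L/2]$, then we must have:

\[ \frac{X}{L-X} < \frac{1}{4} \iff 4X < L - X \iff X < \frac{L}{5}. \]

(ii) If $X \in (L/2, L]$, then

\[ \frac{L-X}{X} < \frac{1}{4} \iff 4L - 4X < X \iff X > \frac{4L}{5}. \]

We have:

\[ P[X < L/5] = \int_0^{L/5} \frac{1}{L}\, dx = \frac{L/5}{L} = \frac{1}{5}. \]

Likewise,

\[ P[X > 4L/5] = \int_{4L/5}^{L} \frac{1}{L}\, dx = \frac{L - 4L/5}{L} = \frac{1}{5}. \]

Thus, the probability requested is equal to $\frac{1}{5} + \frac{1}{5} = \frac{2}{5}$.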

Remarks. (i) By symmetry, we could have considered either of the two cases and
multiplied the probability obtained by 2.

(ii) As can be observed in this example, the probability that the uniform random variable 
X takes on a value in a given subinterval depends only on the length of this subinterval. 
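
The result of Example 3.3.5 can be checked numerically. The following Python sketch is only an illustration: the function names and the general threshold c are our own (they do not appear in the text), and the closed form 2c/(1 + c) is obtained by repeating the argument of the example with 1/4 replaced by c.

```python
import random
from fractions import Fraction


def exact_ratio_probability(c: Fraction) -> Fraction:
    """P[shorter/longer < c] for a point uniform on a segment, with 0 < c <= 1.

    The ratio is below c when X < cL/(1+c) or X > L/(1+c). Each event has
    probability c/(1+c), whatever the length L of the segment is.
    """
    return 2 * c / (1 + c)


def simulated_ratio_probability(c: float, length: float, n: int, seed: int = 0) -> float:
    """Estimate the same probability by simulating n uniform points."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.uniform(0.0, length)
        shorter, longer = min(x, length - x), max(x, length - x)
        # Compare without dividing, so that longer == 0 cannot cause an error.
        if shorter < c * longer:
            hits += 1
    return hits / n


exact = exact_ratio_probability(Fraction(1, 4))            # exact value 2/5
estimate = simulated_ratio_probability(0.25, 1.0, 200_000)  # close to 0.4
```

The estimate does not depend on the value chosen for the length, in agreement with Remark (ii).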


3.3.5 Lognormal distribution 

Definition 3.3.6. Let X be a continuous random variable taking only positive values.
If $Y := \ln X$ follows a $\mathrm{N}(\mu, \sigma^2)$ distribution, then we say that X has a lognormal
distribution with parameters $\mu$ and $\sigma^2$. We write that $X \sim \mathrm{LN}(\mu, \sigma^2)$. The density
function of X is given by

\[ f_X(x) = \frac{1}{\sqrt{2\pi}\, \sigma x} \exp\left\{ -\frac{(\ln x - \mu)^2}{2\sigma^2} \right\} \quad \text{for } x > 0, \]

where $\mu \in \mathbb{R}$ and $\sigma > 0$.

Remark. In many situations, the lognormal distribution may be a more realistic model
than the normal distribution, because a lognormal random variable is always positive.
For instance, the weight of manufactured items could have a lognormal distribution.

Example 3.3.6. Let $X \sim \mathrm{LN}(5, 4)$. Calculate $P[X < 100]$.

Solution. We have:

\[ P[X < 100] = P[\ln X < \ln 100] = P[Y < \ln 100], \quad \text{where } Y \sim \mathrm{N}(5, 4), \]

\[ = P\left[ Z < \frac{\ln 100 - 5}{2} \right] \simeq \Phi(-0.2) \overset{\text{Tab. B.3}}{\simeq} 0.4207. \]
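
The calculation in Example 3.3.6 can be reproduced without a table. The sketch below is a minimal illustration, assuming only the standard library function math.erf; the helper names are ours. Because the argument of $\Phi$ is not rounded to -0.2, the value obtained is about 0.42, slightly larger than the table value.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Distribution function of a N(0,1) random variable, written with erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_cdf(x: float, mu: float, sigma2: float) -> float:
    """P[X <= x] when ln X follows a N(mu, sigma2) distribution."""
    if x <= 0.0:
        return 0.0  # a lognormal random variable takes only positive values
    return standard_normal_cdf((math.log(x) - mu) / math.sqrt(sigma2))


p = lognormal_cdf(100.0, mu=5.0, sigma2=4.0)
```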


3.4 Functions of random variables 

Because a random variable is a real-valued function and the composition of two functions 
is another function, we can assert that if X is a random variable, then Y := g(X), where 
g is a real-valued function defined on the real line, is a random variable as well. In this 
section, we show how to obtain the probability function or the density function of Y. 


3.4.1 Discrete case 

Because a function g associates a single real number with each possible value of the 
random variable X, we can assert that if X is a variable of discrete type, then Y = g(X) 
will also be a discrete random variable, whatever the function g is. Indeed, Y cannot 
take on more different values than X. To obtain the probability function of Y , we apply 
the transformation g to each possible value of X and we add the probabilities of all 
values x of the random variable X that correspond to the same y. 

Example 3.4.1. Let X be a discrete random variable whose probability function is
given by

$x$           -1      0      1
$p_X(x)$      1/4    1/4    1/2

We define $Y = 2X$. Because the function $g : x \to 2x$ is bijective [i.e., to a given
$y = g(x) = 2x$, there corresponds one and only one x and vice versa], the number of
possible values of the random variable Y will be the same as the number of possible
values of X. We find that

$y$           -2      0      2     $\Sigma$
$p_Y(y)$      1/4    1/4    1/2     1

Now, let $W = X^2$. Because to two values of X, namely -1 and 1, there corresponds
the same value $w = 1$, we must add $p_X(-1)$ and $p_X(1)$ to obtain $p_W(1)$. We thus have:

$w$            0      1     $\Sigma$
$p_W(w)$      1/4    3/4     1

In the case when the random variable X can take a (countably) infinite number of
values, we apply the transformation to an arbitrary value of X, and we try to find a
general formula for the function $p_Y(y)$.

Example 3.4.2. Let $X \sim \mathrm{Geo}(p)$ and $Y = X^2$. Because $X \in \{1, 2, \ldots\}$, the quadratic
function is (here) a bijective transformation and we easily calculate

\[ p_Y(y) = p_X(\sqrt{y}) = q^{\sqrt{y}-1}\, p \quad \text{for } y = 1, 4, 9, \ldots, \]

where $q := 1 - p$.
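
The rule used in Examples 3.4.1 and 3.4.2 (apply g to each possible value of X and add the probabilities of the values that are sent to the same y) can be sketched in Python as follows. The function name pmf_of_function and the dictionary representation of a probability function are assumptions made for this illustration.

```python
from collections import defaultdict
from fractions import Fraction


def pmf_of_function(pmf, g):
    """Return the probability function of Y = g(X) as a dict {y: P[Y = y]}."""
    result = defaultdict(Fraction)  # missing keys start at probability 0
    for x, prob in pmf.items():
        result[g(x)] += prob        # values of x with the same g(x) are merged
    return dict(result)


# Probability function of X in Example 3.4.1.
p_x = {-1: Fraction(1, 4), 0: Fraction(1, 4), 1: Fraction(1, 2)}

p_y = pmf_of_function(p_x, lambda x: 2 * x)  # bijective: same probabilities
p_w = pmf_of_function(p_x, lambda x: x * x)  # -1 and 1 are both sent to 1
```

With exact fractions, the probabilities obtained can be compared directly with the tables of Example 3.4.1.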


3.4.2 Continuous case 

The composition of two continuous functions is another continuous function.
Consequently, if X is a continuous random variable and if g is a continuous function, then
$Y := g(X)$ is a continuous random variable as well. In this book, we consider only the
case when the function g is bijective. In that case, the inverse function $g^{-1}(y)$ exists and
we can use the following proposition to obtain the density function of the new random
variable Y.

Proposition 3.4.1. Suppose that the equation $y = g(x)$ has a unique solution:
$x = g^{-1}(y)$. Then, the density function of Y is given by the formula

\[ f_Y(y) = f_X[g^{-1}(y)] \left| \frac{d}{dy} g^{-1}(y) \right|. \]


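Example 3.4.3. We can use the above proposition to prove the result stated in Section
3.3: if X has a $\mathrm{N}(\mu, \sigma^2)$ distribution, then $Z := (X - \mu)/\sigma$ follows a standard normal
distribution. Indeed, we have:

\[ z = g(x) = (x - \mu)/\sigma \iff g^{-1}(z) = \mu + \sigma z. \]

It follows that

\[ f_Z(z) = f_X(\mu + \sigma z) \left| \frac{d}{dz} (\mu + \sigma z) \right| = \frac{1}{\sqrt{2\pi}\, \sigma} \exp\left\{ -\frac{z^2}{2} \right\} |\sigma| = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \]

for $-\infty < z < \infty$, because $\sigma > 0$. Similarly, we can prove that if $Y := aX + b$, then Y
has a normal distribution with parameters $a\mu + b$ and $a^2\sigma^2$.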

Example 3.4.4. Let $X \sim \mathrm{U}(0, 1)$ and $Y = -\theta \ln X$, where $\theta > 0$. First, note that the
possible values of the random variable Y are those located in the interval $(0, \infty)$. Next,
we have that $g^{-1}(y) = e^{-y/\theta}$, so that

\[ f_Y(y) = f_X(e^{-y/\theta}) \left| \frac{d}{dy} e^{-y/\theta} \right| = 1 \cdot \left| -\frac{1}{\theta}\, e^{-y/\theta} \right| = \frac{1}{\theta}\, e^{-y/\theta} \]

for $0 < y < \infty$. Thus, we can assert that Y has an exponential distribution with
parameter $1/\theta$. We make use of this result in simulation. Indeed, if we wish to generate an
observation of a random variable following an exponential distribution with parameter
$\lambda = 2$ (for instance), it suffices to generate an observation x of a U(0,1) distribution,
and then to apply the transformation $y = -(1/2) \ln x$. Therefore, it is not necessary to
write a special computer program to generate observations of exponential distributions.
Computer languages in general, and even many pocket calculators, allow us to generate
pseudo-random observations of uniform distributions.
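
The simulation technique described in Example 3.4.4 can be sketched as follows. This is a minimal illustration in Python, assuming the standard random module as the source of pseudo-random U(0,1) observations; the function name, the seed, and the sample size are arbitrary choices of ours.

```python
import math
import random


def exponential_observation(rate: float, rng: random.Random) -> float:
    """Generate one observation of an Exp(rate) distribution as -(1/rate) ln U."""
    # rng.random() lies in [0, 1); using 1 - U gives a value in (0, 1],
    # so the logarithm is always finite.
    u = 1.0 - rng.random()
    return -math.log(u) / rate


rng = random.Random(12345)
sample = [exponential_observation(2.0, rng) for _ in range(100_000)]
sample_mean = sum(sample) / len(sample)  # should be close to 1/2
```

The sample mean is expected to be close to 1/2, which is the mean of an exponential distribution with parameter 2 (see Section 3.5).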

3.5 Characteristics of random variables 

In this section, we present some numerical quantities that enable us to characterize
a random variable X. All the quantities are obtained by computing the mathematical
expectation of various functions of X.

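Definition 3.5.1. We define the mathematical expectation of a function g(X) of
a random variable X by

\[ E[g(X)] = \begin{cases} \displaystyle \sum_{i=1}^{\infty} g(x_i)\, p_X(x_i) & \text{if X is discrete,} \\[2ex] \displaystyle \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx & \text{if X is continuous.} \end{cases} \tag{3.8} \]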


Properties. (i) $E[c] = c$, for any constant c.

(ii) $E[c_1 g(X) + c_0] = c_1 E[g(X)] + c_0$, for all constants $c_1$ and $c_0$.

Remarks. (i) E is therefore a linear operator.

(ii) The mathematical expectation may be infinite and it may even not exist. 

(iii) If X is a random variable of mixed type, that is, a random variable that is discrete
and continuous at the same time, then $E[g(X)]$ can be computed by decomposing the
problem into two parts. For example, suppose that we toss a coin for which the
probability of getting “tails” is 3/8. If we get “tails,” then the random variable X takes on
the value 1; otherwise, X is a number taken at random in the interval [2,4]. We can
obtain $E[g(X)]$ as follows:

\[ E[g(X)] = g(1) \cdot \frac{3}{8} + \frac{5}{8} \int_2^4 g(x) \cdot \frac{1}{2}\, dx. \]

Definition 3.5.2. The mean (or the expected value) of a random variable X is given
by

\[ \mu_X \equiv E[X] = \begin{cases} \displaystyle \sum_{i=1}^{\infty} x_i\, p_X(x_i) & \text{if X is discrete,} \\[2ex] \displaystyle \int_{-\infty}^{\infty} x\, f_X(x)\, dx & \text{if X is continuous.} \end{cases} \tag{3.9} \]
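
To illustrate the decomposition used for random variables of mixed type, the following Python sketch computes $E[g(X)]$ for the coin-tossing example of Remark (iii). It assumes that "taken at random in [2,4]" means uniformly distributed on that interval; the function name and the midpoint rule used for the integral are our own choices.

```python
def mixed_expectation(g, p_tails=3 / 8, low=2.0, high=4.0, steps=10_000):
    """E[g(X)] when X = 1 with probability p_tails and X ~ U[low, high] otherwise."""
    discrete_part = p_tails * g(1.0)
    # Continuous part: integral of g(x) / (high - low) over [low, high],
    # approximated with the midpoint rule.
    h = (high - low) / steps
    integral = sum(g(low + (i + 0.5) * h) for i in range(steps)) * h / (high - low)
    return discrete_part + (1.0 - p_tails) * integral


mean_x = mixed_expectation(lambda x: x)        # 3/8 * 1 + 5/8 * 3 = 2.25
second_moment = mixed_expectation(lambda x: x * x)
```

With g(x) = x, we obtain the mean of X, namely (3/8)(1) + (5/8)(3) = 2.25.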


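Example 3.5.1. Let $X \sim \mathrm{Poi}(\lambda)$. We have:

\[ \mu_X = \sum_{x=0}^{\infty} x\, e^{-\lambda} \frac{\lambda^x}{x!} = \sum_{x=1}^{\infty} e^{-\lambda} \frac{\lambda^x}{(x-1)!} = e^{-\lambda} \lambda \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} \]

\[ = e^{-\lambda} \lambda \left\{ 1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \cdots \right\} = e^{-\lambda} \lambda\, e^{\lambda} = \lambda. \]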


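Example 3.5.2. Let $X \sim \mathrm{Exp}(\lambda)$. We calculate

\[ \mu_X = \int_0^{\infty} x \lambda e^{-\lambda x}\, dx = \lambda\, \frac{1!}{\lambda^2} = \frac{1}{\lambda}, \]

because, in general, if $a > 0$, then we have:

\[ \int_0^{\infty} x^n e^{-ax}\, dx \overset{y = ax}{=} \int_0^{\infty} \frac{y^n}{a^{n+1}}\, e^{-y}\, dy = \frac{\Gamma(n+1)}{a^{n+1}} = \frac{n!}{a^{n+1}}. \tag{3.10} \]

Remark. We could also have integrated the function $x \lambda e^{-\lambda x}$ by parts.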

Definition 3.5.3. A median of a random variable X is a value $x_m$ ($\equiv \tilde{x}$) for which

\[ P[X \le x_m] \ge \frac{1}{2} \quad \text{and} \quad P[X \ge x_m] \ge \frac{1}{2}. \]


Remarks. (i) When X is a discrete random variable, the median is not necessarily
unique. It is not unique if there exists a real number a such that $F_X(a) = 1/2$. For
example, let $X \sim \mathrm{B}(n = 2, p = 1/2)$. We have that $F_X(1) = P[X = 0] + P[X = 1] =
1/4 + 1/2 = 3/4$. In this case, the number $x_m = 1$ satisfies the above definition, because
$P[X \le 1] = 3/4 \ge 1/2$ and $P[X \ge 1] = 3/4 \ge 1/2$ as well. Furthermore, $x_m = 1$ is the
only number for which both inequalities are satisfied at the same time. On the other
hand, let

$x$            1      2      3
$p_X(x)$      1/4    1/4    1/2

The real number 2 satisfies the definition of the median, but so does the number 2.5
(for instance). In fact, every number in the interval [2,3] is a median of X. Note
that if we change the probabilities as follows: $P[X = 1] = 1/4$, $P[X = 2] = 1/8$, and
$P[X = 3] = 5/8$, then the median $x_m = 3$ is unique. Likewise, if $P[X = 1] = 1/4$,
$P[X = 2] = 3/8$, and $P[X = 3] = 3/8$, then $x_m = 2$ is the unique median of X.

(ii) When X is a continuous random variable taking all its values in a single (finite or
infinite) interval, the median is unique and can be defined as follows:

\[ P[X \le x_m] = 1/2 \quad (\Longrightarrow P[X \ge x_m] = 1/2). \]
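
The two inequalities that define a median can be checked mechanically in the discrete case. The Python sketch below is an illustration only: the function name is_median and the dictionary representation of the probability function are assumptions of ours. It is applied to the two three-point distributions considered in Remark (i).

```python
from fractions import Fraction


def is_median(pmf, candidate):
    """Check P[X <= candidate] >= 1/2 and P[X >= candidate] >= 1/2."""
    below = sum(p for x, p in pmf.items() if x <= candidate)
    above = sum(p for x, p in pmf.items() if x >= candidate)
    half = Fraction(1, 2)
    return below >= half and above >= half


# Distribution for which every number in [2, 3] is a median.
pmf_a = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 2)}
# Modified probabilities, for which the median 3 is unique.
pmf_b = {1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(5, 8)}

medians_a = [c for c in (Fraction(19, 10), 2, Fraction(5, 2), 3) if is_median(pmf_a, c)]
medians_b = [c for c in (1, 2, Fraction(5, 2), 3) if is_median(pmf_b, c)]
```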


Example 3.5.1 (continued). Suppose that $X \sim \mathrm{Poi}(2)$. We find in Table B.2,
page 278, that $P[X \le 1] \simeq 0.4060$ and $P[X \le 2] \simeq 0.6767$. There is no number
$x_m$ such that $P[X \le x_m] = P[X \ge x_m] = 1/2$. However, we have:

\[ P[X \le 2] \simeq 0.6767 \ge 1/2 \quad \text{and} \quad P[X \ge 2] \simeq 1 - 0.4060 \ge 1/2. \]

Moreover, 2 is the only number for which both inequalities are satisfied. Hence, $x_m = 2$
is the median of X.


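Example 3.5.2 (continued). If $X \sim \mathrm{Exp}(\lambda)$, then we have:

\[ P[X \le x_m] = \int_0^{x_m} \lambda e^{-\lambda x}\, dx = -e^{-\lambda x} \Big|_0^{x_m} = 1 - e^{-\lambda x_m}. \]

It follows that

\[ P[X \le x_m] = 1/2 \iff 1 - e^{-\lambda x_m} = 1/2 \iff x_m = \frac{\ln 2}{\lambda}. \]

We can check that we indeed have:

\[ P\left[ X \ge \frac{\ln 2}{\lambda} \right] = \int_{(\ln 2)/\lambda}^{\infty} \lambda e^{-\lambda x}\, dx = -e^{-\lambda x} \Big|_{(\ln 2)/\lambda}^{\infty} = \exp\left\{ -\lambda\, \frac{\ln 2}{\lambda} \right\} = 1/2. \]

Thus, the median is given by $x_m = (\ln 2)/\lambda$.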

The median is useful when the random variable X may take on very large values
(in absolute value) as compared to the others. Indeed, the median is less influenced by
these extreme values than the mean $\mu_X$ is.

The median, in the continuous case, is a particular case of the notion of quantile.

Definition 3.5.4. Let X be a continuous random variable whose set of possible values
is an arbitrary interval (a, b). The number $x_p$ is called the 100(1 - p)th quantile of X
if

\[ P[X \le x_p] = 1 - p, \]

where $0 < p < 1$.

If 100p is an integer, then $x_p$ is also called the 100(1 - p)th percentile of X. The
median of a continuous random variable is therefore the 50th percentile of X. The 25th
percentile is also known as the first quartile, the 50th percentile is the second quartile,
and the 75th percentile is the third quartile. Finally, the difference between the third
and the first quartile is called the interquartile range.
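
As an illustration of Definition 3.5.4, the quantiles of an exponential distribution can be obtained in closed form: solving $1 - e^{-\lambda x_p} = 1 - p$ gives $x_p = -(\ln p)/\lambda$. The Python sketch below uses this formula; the function name and the value $\lambda = 2$ are arbitrary choices made for the example.

```python
import math


def exponential_quantile(p: float, rate: float) -> float:
    """Return x_p such that P[X <= x_p] = 1 - p when X ~ Exp(rate)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    # 1 - exp(-rate * x) = 1 - p  <=>  x = -ln(p) / rate
    return -math.log(p) / rate


rate = 2.0
first_quartile = exponential_quantile(0.75, rate)  # 25th percentile (1 - p = 0.25)
median = exponential_quantile(0.50, rate)          # 50th percentile
third_quartile = exponential_quantile(0.25, rate)  # 75th percentile (1 - p = 0.75)
interquartile_range = third_quartile - first_quartile
```

Note that, with the convention of the definition, the 25th percentile corresponds to p = 0.75. The median found here agrees with the formula $x_m = (\ln 2)/\lambda$.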


Definition 3.5.5. A mode of a random variable X is any value x that corresponds to 
a local maximum for px (x) or fx (x). 

Remark. The mode is thus not necessarily unique. A distribution having a single mode 
is said to be unimodal. 


Example 3.5.1 (continued). Let $X \sim \mathrm{Poi}(2)$. From Table B.2, page 278, we obtain
that $P[X = 0] \simeq 0.135$, $P[X = 1] \simeq 0.271$, $P[X = 2] \simeq 0.271$, $P[X = 3] \simeq 0.180$, and
so on. Hence, X has two modes: at $x = 1$ and at $x = 2$.

Remark. We indeed have:

\[ P[X = 1] = e^{-2} \cdot 2 = e^{-2}\, \frac{2^2}{2!} = P[X = 2] \quad (\simeq 0.271). \]

That is, the two probabilities are exactly equal.

Example 3.5.2 (continued). Let $X \sim \mathrm{Exp}(\lambda)$. Then, X being a continuous random
variable, we can use differential calculus to find its mode(s). We have:

\[ \frac{d}{dx} f_X(x) = \frac{d}{dx}\, \lambda e^{-\lambda x} = -\lambda^2 e^{-\lambda x} \ne 0 \]

for all $x \in (0, \infty)$. However, because $f_X(x)$ is a strictly decreasing function, we can
assert that the mode of X is at $x = 0^+$ [$f_X(x)$ tends to a minimum as $x \to \infty$].

The various quantities defined above are measures of central position. We continue 
by defining measures of dispersion. 

Definition 3.5.6. The range of a random variable is the difference between the largest 
and the smallest value that this variable can take. 

For example, the range of a random variable $X \sim \mathrm{B}(n, p)$ is equal to $n - 0 = n$.
Likewise, if $X \sim \mathrm{Exp}(\lambda)$, then its range is $\infty - 0 = \infty$.

Definition 3.5.7. The variance of a random variable X is defined by

\[ \sigma_X^2 \equiv \mathrm{VAR}[X] = \begin{cases} \displaystyle \sum_{i=1}^{\infty} (x_i - \mu_X)^2\, p_X(x_i) & \text{if X is discrete,} \\[2ex] \displaystyle \int_{-\infty}^{\infty} (x - \mu_X)^2\, f_X(x)\, dx & \text{if X is continuous.} \end{cases} \]

Remarks. (i) We deduce at once from the definition that the variance of X is always
nonnegative (it can be infinite). Actually, it is strictly positive, except if the random
variable X is a constant, which is a degenerate random variable. Finally, the larger the
variance is, the more spread out is the distribution of the random variable around its
mean.

(ii) We also define the standard deviation of a random variable X by

\[ \mathrm{STD}[X] = \sqrt{\mathrm{VAR}[X]} \equiv \sigma_X. \]

We often prefer to work with the standard deviation rather than with the variance of
a random variable, because it is easier to interpret. Indeed, the standard deviation is
expressed in the same units of measure as X, whereas the units of the variance are the
squared units of X.

Now, we may write that

\[ \mathrm{VAR}[X] = E[(X - E[X])^2], \]

and we can show that

\[ \mathrm{VAR}[X] = E[X^2] - (E[X])^2. \tag{3.11} \]


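Example 3.5.1 (continued). When $X \sim \mathrm{Poi}(\lambda)$, we calculate

\[ E[X^2] = \sum_{x=0}^{\infty} x^2 e^{-\lambda} \frac{\lambda^x}{x!} = e^{-\lambda} \sum_{x=1}^{\infty} x\, \frac{\lambda^x}{(x-1)!} = e^{-\lambda} \lambda \sum_{x=1}^{\infty} \frac{d}{d\lambda} \frac{\lambda^x}{(x-1)!} \]

\[ = e^{-\lambda} \lambda\, \frac{d}{d\lambda} \sum_{x=1}^{\infty} \frac{\lambda^x}{(x-1)!} = e^{-\lambda} \lambda\, \frac{d}{d\lambda} \left( \lambda e^{\lambda} \right) \]

\[ = e^{-\lambda} \lambda \left( e^{\lambda} + \lambda e^{\lambda} \right) = \lambda + \lambda^2. \]

Then, using Formula (3.11) with $E[X] = \lambda$, we find that

\[ \mathrm{VAR}[X] = (\lambda + \lambda^2) - (\lambda)^2 = \lambda. \]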

Thus, in the case of the Poisson distribution, its parameter $\lambda$ is both its mean and its
variance.

Example 3.5.2 (continued). We already found that if $X \sim \mathrm{Exp}(\lambda)$, then $E[X] = 1/\lambda$.
We now calculate [see (3.10)]

\[ E[X^2] = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\, dx = \lambda\, \frac{2!}{\lambda^3} = \frac{2}{\lambda^2}. \]

It follows that

\[ \mathrm{VAR}[X] = \frac{2}{\lambda^2} - \left( \frac{1}{\lambda} \right)^2 = \frac{1}{\lambda^2}. \]

Note that, in the case of the exponential distribution, its mean and its standard deviation
are equal.
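
The means and variances obtained for the Poisson and exponential distributions can be checked numerically. The Python sketch below is only an illustration under our own assumptions: the Poisson series is truncated after a fixed number of terms (200 by default), and the exponential moments are computed from Formula (3.10) rather than by numerical integration.

```python
import math


def poisson_mean_and_variance(lam: float, terms: int = 200):
    """Approximate E[X] and VAR[X] for X ~ Poi(lam) by truncating the series."""
    prob = math.exp(-lam)  # P[X = 0]
    mean = second_moment = 0.0
    for x in range(terms):
        mean += x * prob
        second_moment += x * x * prob
        prob *= lam / (x + 1)  # P[X = x + 1] = P[X = x] * lam / (x + 1)
    return mean, second_moment - mean ** 2  # Formula (3.11)


def exponential_mean_and_variance(lam: float):
    """E[X] and VAR[X] for X ~ Exp(lam), using E[X^n] = lam * n! / lam^(n+1)."""
    first = lam * math.factorial(1) / lam ** 2
    second = lam * math.factorial(2) / lam ** 3
    return first, second - first ** 2  # Formula (3.11)


poisson_mean, poisson_variance = poisson_mean_and_variance(2.0)
exp_mean, exp_variance = exponential_mean_and_variance(4.0)
```

For the exponential distribution with $\lambda = 4$, the mean 1/4 and the standard deviation $\sqrt{1/16} = 1/4$ are indeed equal.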

Table 3.1, page 89, gives the mean and the variance of the various probability
distributions found in Sections 3.2 and 3.3.


The main properties of the mathematical operator VAR are the following: 

(i) VAR[c] = 0, for any constant c. 

(ii) $\mathrm{VAR}[c_1 g(X) + c_0] = c_1^2\, \mathrm{VAR}[g(X)]$, for any constants $c_1$ and $c_0$. Thus, the operator
VAR is not linear.
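
Property (ii) can be verified on a small example. The following Python sketch uses the probability function of Example 3.4.1 with $g(x) = x^2$; the helper functions and the constants $c_1 = 3$ and $c_0 = 5$ are arbitrary choices made for this illustration.

```python
from fractions import Fraction


def expectation(pmf, g):
    """E[g(X)] for a discrete random variable given as a dict {x: P[X = x]}."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf, g):
    """VAR[g(X)] computed from the definition E[(g(X) - E[g(X)])^2]."""
    mean = expectation(pmf, g)
    return expectation(pmf, lambda x: (g(x) - mean) ** 2)


pmf = {-1: Fraction(1, 4), 0: Fraction(1, 4), 1: Fraction(1, 2)}
c1, c0 = Fraction(3), Fraction(5)


def g(x):
    return x * x


lhs = variance(pmf, lambda x: c1 * g(x) + c0)  # VAR[c1 g(X) + c0]
rhs = c1 ** 2 * variance(pmf, g)               # c1^2 VAR[g(X)]
```

Both sides are equal to 27/16: the additive constant has no effect on the variance, and the multiplicative constant is squared.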

The mean and the variance of a random variable are particular cases of the quantities 
known as the moments of this variable. 


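Definition 3.5.8. The moment of order k or kth-order (or simply kth) moment
of a random variable X about a point a is defined by

\[ E[(X - a)^k] = \begin{cases} \displaystyle \sum_{i=1}^{\infty} (x_i - a)^k\, p_X(x_i) & \text{if X is discrete,} \\[2ex] \displaystyle \int_{-\infty}^{\infty} (x - a)^k\, f_X(x)\, dx & \text{if X is continuous.} \end{cases} \]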


Particular cases 

(i) The kth-order moment about the origin, or noncentral moment, of X is

\[ E[X^k] \equiv \mu'_k = \sum_{i=1}^{\infty} x_i^k\, p_X(x_i) \quad \text{or} \quad \int_{-\infty}^{\infty} x^k f_X(x)\, dx \]

for $k = 0, 1, \ldots$. We have that $\mu'_0 = 1$ and $\mu'_1 = \mu_X = E[X]$.

(ii) The kth-order moment about the mean, or central moment, of X is

\[ E[(X - \mu_X)^k] \equiv \mu_k = \sum_{i=1}^{\infty} (x_i - \mu_X)^k\, p_X(x_i) \quad \text{or} \quad \int_{-\infty}^{\infty} (x - \mu_X)^k f_X(x)\, dx \]

for $k = 0, 1, \ldots$. We have that $\mu_0 = 1$, $\mu_1 = 0$, and $\mu_2 = \sigma_X^2$.

Table 3.1. Means and variances of the probability distributions of Sections 3.2 and 3.3

Distribution      Parameters              Mean                               Variance
Bernoulli         $p$                     $p$                                $pq$
Binomial          $n$ and $p$             $np$                               $npq$
Hypergeometric    $N$, $n$, and $d$       $n\, d/N$                          $n\, (d/N)(1 - d/N)\, \frac{N-n}{N-1}$
Geometric         $p$                     $1/p$                              $q/p^2$
Pascal            $r$ and $p$             $r/p$                              $rq/p^2$
Poisson           $\lambda$               $\lambda$                          $\lambda$
Uniform           $[a, b]$                $(a+b)/2$                          $(b-a)^2/12$
Exponential       $\lambda$               $1/\lambda$                        $1/\lambda^2$
Laplace           $\lambda$               $0$                                $2/\lambda^2$
Gamma             $\alpha$ and $\lambda$  $\alpha/\lambda$                   $\alpha/\lambda^2$
Weibull           $\lambda$ and $\beta$   $\Gamma(1+\beta^{-1})/\lambda^{1/\beta}$   $[\Gamma(1+2\beta^{-1}) - \Gamma^2(1+\beta^{-1})]/\lambda^{2/\beta}$
Normal            $\mu$ and $\sigma^2$    $\mu$                              $\sigma^2$
Beta              $\alpha$ and $\beta$    $\alpha/(\alpha+\beta)$            $\alpha\beta/[(\alpha+\beta+1)(\alpha+\beta)^2]$
Lognormal         $\mu$ and $\sigma^2$    $e^{\mu + \sigma^2/2}$             $e^{2\mu+\sigma^2}(e^{\sigma^2} - 1)$

Example 3.5.2 (continued). In the case when $X \sim \mathrm{Exp}(\lambda)$, we calculate [see (3.10)]

\[ E[X^k] = \int_0^{\infty} x^k \lambda e^{-\lambda x}\, dx = \lambda\, \frac{k!}{\lambda^{k+1}} = \frac{k!}{\lambda^k}. \]

Remark. We can show, making use of Newton’s binomial formula, that

\[ \mu_k = E[(X - \mu_X)^k] = \sum_{i=0}^{k} (-1)^i \binom{k}{i}\, \mu'_{k-i}\, \mu_X^i. \]

This formula enables us to check that $\mathrm{VAR}[X] = E[X^2] - (E[X])^2$, which may be
rewritten as follows:

\[ \mu_2 = \mu'_2 - \mu_X^2 = \mu'_2 - (\mu'_1)^2. \]

Two other quantities that are used to characterize the distribution of a random 
variable X are the skewness and kurtosis coefficients. These two coefficients are defined 
in terms of the moments of X. 

First, the quantity $\mu_3 \equiv E[(X - \mu_X)^3]$ is used to measure the degree of asymmetry
of probability distributions. If the distribution of X is symmetrical with respect to its
mean $\mu_X$, then $\mu_3 = 0$ (if we assume that $\mu_3$ exists). If $\mu_3 > 0$ (resp., $< 0$), then the
distribution is said to be right-skewed (resp., left-skewed).

Remark. Actually, if the distribution is symmetrical with respect to its mean and if all
its central moments exist, then we may write that $\mu_{2k+1} = 0$, for $k = 0, 1, \ldots$.

Definition 3.5.9. The skewness (coefficient) of a random variable X is defined by

\[ \beta_1 := \frac{\mu_3^2}{\sigma_X^6}. \]

Remarks. (i) The coefficient $\beta_1$ is a unitless quantity.

(ii) Some authors prefer to work with the coefficient $\gamma_1 := \mu_3/\sigma_X^3$.

Example 3.5.2 (continued). We can write that

\[ \mu_3 = E[(X - \mu_X)^3] = E[X^3 - 3\mu_X X^2 + 3\mu_X^2 X - \mu_X^3]. \]

As shown in Chapter 4, the mathematical expectation of a linear combination of random
variables can be obtained by replacing each variable by its mean (by linearity of the
expectation operator). It follows that

\[ \mu_3 = E[X^3] - 3\mu_X E[X^2] + 3\mu_X^2 E[X] - \mu_X^3. \]

Making use of the formula $E[X^k] = k!/\lambda^k$, we obtain:

\[ \mu_3 = \frac{3!}{\lambda^3} - 3\, \frac{1}{\lambda}\, \frac{2!}{\lambda^2} + 3\, \frac{1}{\lambda^2}\, \frac{1!}{\lambda} - \frac{1}{\lambda^3} = \frac{2}{\lambda^3}. \]

So, we find that

\[ \beta_1 = \frac{(2/\lambda^3)^2}{(1/\lambda^2)^3} = 4. \]

Thus, all the exponential distributions have the same skewness $\beta_1$, whatever the value
of the parameter $\lambda$ is, which reflects the fact that they all have the same shape (see
Figure 3.11).

Fig. 3.11. Skewness coefficient of exponential distributions.


Example 3.5.3. Let $X \sim \mathrm{U}(a, b)$. The density function $f_X(x)$ is constant, therefore
it is symmetrical with respect to $\mu_X = (a + b)/2$. Given that all the moments of the
random variable X exist (because X is bounded by a and b), it follows that $\beta_1 = 0$ (see
Figure 3.12).

Definition 3.5.10. The kurtosis (coefficient) of a random variable X is the unitless
quantity

\[ \beta_2 := \frac{\mu_4}{\sigma_X^4}. \]

Remarks. (i) When the distribution of X is symmetrical, $\beta_2$ measures the relative
thickness of the tails of the distribution with respect to its central part.

(ii) As in the case of the coefficient $\beta_1$, some authors use a different coefficient: $\gamma_2 :=
\beta_2 - 3$. Because $\beta_2 = 3$ if $X \sim \mathrm{N}(\mu, \sigma^2)$, the quantity $\gamma_2$ is chosen so that the kurtosis
of all normal distributions is equal to zero.

Fig. 3.12. Skewness coefficient of the uniform distribution.

Example 3.5.2 (continued). Making use once again of the formula $E[X^k] = k!/\lambda^k$,
we may write that

\[ \mu_4 = E[(X - \mu_X)^4] = E[X^4 - 4\mu_X X^3 + 6\mu_X^2 X^2 - 4\mu_X^3 X + \mu_X^4] \]

\[ = E[X^4] - 4\mu_X E[X^3] + 6\mu_X^2 E[X^2] - 4\mu_X^3 E[X] + \mu_X^4 \]

\[ = \frac{4!}{\lambda^4} - 4\, \frac{1}{\lambda}\, \frac{3!}{\lambda^3} + 6\, \frac{1}{\lambda^2}\, \frac{2!}{\lambda^2} - 4\, \frac{1}{\lambda^3}\, \frac{1}{\lambda} + \frac{1}{\lambda^4} = \frac{9}{\lambda^4}. \]

Therefore, we have:

\[ \beta_2 = \frac{9/\lambda^4}{(1/\lambda^2)^2} = 9, \]

which is independent of the parameter $\lambda$.
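
The values $\beta_1 = 4$ and $\beta_2 = 9$ can be recovered from the noncentral moments $E[X^k] = k!/\lambda^k$ and the binomial formula for central moments stated above. The Python sketch below is an illustration with exact fractions; the function names are ours.

```python
from fractions import Fraction
from math import comb, factorial


def exponential_raw_moment(k: int, lam: Fraction) -> Fraction:
    """Noncentral moment E[X^k] = k! / lam^k of an Exp(lam) random variable."""
    return Fraction(factorial(k)) / lam ** k


def central_moment(k: int, raw) -> Fraction:
    """mu_k = sum_{i=0}^{k} (-1)^i C(k, i) mu'_{k-i} mu_X^i (binomial formula)."""
    mean = raw(1)
    return sum((-1) ** i * comb(k, i) * raw(k - i) * mean ** i for i in range(k + 1))


def skewness_and_kurtosis(lam: Fraction):
    """Return (beta_1, beta_2) = (mu_3^2 / sigma^6, mu_4 / sigma^4)."""
    def raw(k):
        return exponential_raw_moment(k, lam)

    mu2, mu3, mu4 = (central_moment(k, raw) for k in (2, 3, 4))
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2


beta_1, beta_2 = skewness_and_kurtosis(Fraction(3, 2))
```

Changing the value of the parameter leaves both coefficients unchanged, as stated in the text.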

Example 3.5.4. As we mentioned above, we can show that if $X \sim \mathrm{N}(\mu, \sigma^2)$, then
$\beta_2 = 3$. A distribution whose density function looks a lot like that of a normal
distribution is the Student distribution with n degrees of freedom, which is important
in statistics. Its kurtosis is given by

\[ \beta_2 = \frac{6}{n-4} + 3 \quad \text{if } n > 4. \]

Note that the coefficient $\beta_2$ is larger than that of normal distributions, reflecting the
fact that the density function $f_X(x)$ tends less rapidly to 0 as x tends to $\pm\infty$ than it
does in the case of normal distributions. However, note also that $\beta_2$ decreases to 3 as n
tends to infinity.

Example 3.5.5. Let $X \sim \mathrm{B}(1, p)$. That is, X has a Bernoulli distribution with
parameter p. We have:

\[ \mu'_k = E[X^k] = \sum_{x=0}^{1} x^k p_X(x) = 0^k q + 1^k p = p \]

for $k = 1, 2, \ldots$. In particular, $E[X] = \mu'_1 = p$, so that

\[ \mu_k = E[(X - p)^k] = \sum_{x=0}^{1} (x - p)^k p_X(x) = (-p)^k q + (1 - p)^k p. \]

It follows that $\mathrm{VAR}[X] \equiv \mu_2 = p^2 q + q^2 p = pq(p + q) = pq$. We also have:

\[ \mu_3 = -p^3 q + q^3 p = pq(-p^2 + q^2) \]

and

\[ \mu_4 = p^4 q + q^4 p = pq(p^3 + q^3). \]

Then, we calculate

\[ \beta_1 = \frac{\mu_3^2}{\sigma^6} = \frac{p^2 q^2 (-p^2 + q^2)^2}{(pq)^3} = \frac{p^3}{q} + \frac{q^3}{p} - 2pq \]

and

\[ \beta_2 = \frac{\mu_4}{\sigma^4} = \frac{pq(p^3 + q^3)}{(pq)^2} = \frac{p^2}{q} + \frac{q^2}{p}. \]

Remark. The coefficients $\beta_1$ and $\beta_2$ are often used to compare an arbitrary distribution
with a normal distribution, for which $\beta_1 = 0$ (by symmetry) and $\beta_2 = 3$. For
example, if X follows a chi-square distribution with n degrees of freedom (see the gamma
distribution above), then $\beta_1 = 8/n$ and

\[ \beta_2 = \frac{12}{n} + 3. \]

Note that $\beta_1$ decreases to 0 and $\beta_2$ decreases to 3 as n tends to infinity. Actually, the
density function of X tends to that of a normal distribution, which is a consequence of
the central limit theorem (see Chapter 4).

We end this section by giving a proposition that enables us to obtain a bound for
a certain probability, when the mean and the variance of the random variable involved
are (known and) finite.

Proposition 3.5.1. (Bienaymé-Chebyshev inequality) For an arbitrary constant
$a > 0$, we have:

\[ P[\mu_X - a\sigma_X \le X \le \mu_X + a\sigma_X] \ge 1 - \frac{1}{a^2} \]

for any random variable X whose variance $\mathrm{VAR}[X] \equiv \sigma_X^2$ is finite.

Remarks. (i) For the variance of X to be finite, its mean $E[X]$ must also be finite.

(ii) Generally, we say that the mean (resp., the variance) of a random variable X does
not exist if $E[X] = \pm\infty$ (resp., $\mathrm{VAR}[X] = \infty$). This is the reason why in many books
the validity condition for the Bienaymé-Chebyshev inequality is that the variance of
X must exist. We can, however, distinguish between the case when the mean (or the
variance) of X is infinite and that when it does not exist. For instance, the mean of
a Cauchy distribution (see p. 8) does not exist, because we find that $E[X] = \infty - \infty$,
which is not defined. Then, its variance does not exist either.

Example 3.5.6. If X is a random variable for which $E[X] = 0$ and $\mathrm{VAR}[X] = 1$, then
we may write that

\[ P[-3 \le X \le 3] \ge 1 - \frac{1}{9} = 0.\overline{8}. \]

In the case of a N(0,1) distribution, this probability is actually greater than 99.7% (as
mentioned above). If X follows a uniform distribution on the interval $[-\sqrt{3}, \sqrt{3}]$, so that
its mean is zero and its variance is equal to 1, then the probability in question is equal
to 1 (because $[-\sqrt{3}, \sqrt{3}] \subset [-3, 3]$).
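
The comparison made in Example 3.5.6 can be reproduced numerically. The Python sketch below is an illustration: it evaluates the Bienaymé-Chebyshev bound and the exact probabilities for the two standardized distributions mentioned in the example. The function names are ours, and the normal probability is written with the standard library function math.erf.

```python
import math


def chebyshev_lower_bound(a: float) -> float:
    """Lower bound 1 - 1/a^2 for P[mu - a*sigma <= X <= mu + a*sigma]."""
    if a <= 0.0:
        raise ValueError("a must be positive")
    return max(0.0, 1.0 - 1.0 / a ** 2)  # the bound is trivial when a <= 1


def standard_normal_within(a: float) -> float:
    """Exact P[-a <= Z <= a] for Z ~ N(0,1)."""
    return math.erf(a / math.sqrt(2.0))


def standard_uniform_within(a: float) -> float:
    """Exact P[-a <= X <= a] for X uniform on [-sqrt(3), sqrt(3)] and a > 0."""
    half_width = math.sqrt(3.0)
    return min(a, half_width) / half_width


bound = chebyshev_lower_bound(3.0)         # 8/9
normal_prob = standard_normal_within(3.0)  # greater than 0.997
uniform_prob = standard_uniform_within(3.0)
```

In both cases the exact probability is much larger than the bound, which is valid whatever the distribution of X is.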


3.6 Exercises for Chapter 3 


Solved exercises 


Question no. 1 

Let

\[ p_X(x) = \begin{cases} a/8 & \text{if } x = -1, \\ a/4 & \text{if } x = 0, \\ a/8 & \text{if } x = 1, \end{cases} \]

where $a > 0$. Find the constant a.

Question no. 2 

Let 

\[ f_X(x) = \frac{3}{4} (1 - x^2) \quad \text{if } -1 < x < 1. \]


Calculate Fx (0). 


Question no. 3 

Calculate the standard deviation of X if

\[ p_X(x) = \frac{1}{3} \quad \text{for } x = 1, 2 \text{ or } 3. \]


Question no. 4 

Suppose that 

fx(x) = 2x if 0 < x < 1. 

Calculate $E[X^{1/2}]$.

Question no. 5 

Calculate the 25th percentile of the random variable X for which

\[ F_X(x) = \frac{x^2}{4} \quad \text{if } 0 < x < 2. \]

Question no. 6 

Let 

fx (x) = ^ if 0 < x < 2. 

We define $Y = X + 1$. Calculate $f_Y(y)$.

Question no. 7 

Two items are taken at random and without replacement from a box containing 5 
brand A and 10 brand B items. Let X be the number of brand A items among the two 
selected. What is the probability distribution of X and its parameters? 

Question no. 8 

Suppose that $X \sim \mathrm{B}(n = 5, p = 0.25)$. What is the mode of X, that is, the most
probable value of X?

Question no. 9 

Ten percent of the articles produced by a certain machine are defective. If 10
(independent) articles fabricated by this machine are taken at random, what is the probability
that exactly two of them are defective?

Question no. 10 

Calculate $P[X \ge 1 \mid X \le 1]$ if $X \sim \mathrm{Poi}(\lambda = 5)$.

Question no. 11 

Failures occur according to a Poisson process with rate $\lambda = 2$ per day. Calculate the
probability that, in the course of two consecutive days, exactly one failure (in all) will
occur.

Question no. 12 

In a certain lake, there are 200 type I and 50 type II fish. We draw, without
replacement, five fish from the lake. Use a binomial distribution to calculate approximately the
probability that we get no type II fish.

Question no. 13 

Let X ~ B(n = 50 ,p = 0.01). Use a Poisson distribution to calculate approximately 
P[X > 4]. 

Question no. 14 

A fair coin is tossed until “heads” is obtained. What is the probability that the 
random experiment will end on the fifth toss? 

Question no. 15 

The lifetime X of a radio has an exponential distribution with mean equal to ten 
years. What is the probability that a ten-year-old radio will still work after ten additional 
years?

Question no. 16

Suppose that $X \sim \mathrm{Exp}(\lambda)$. Find the value of $\lambda$ such that $P[X > 1] = 2 P[X > 2]$.

Question no. 17 

Let $X \sim \mathrm{G}(\alpha = 2, \lambda = 1)$. What other probability distribution can be used to
calculate exactly $P[X < 4]$? Give also the parameter(s) of this distribution.

Question no. 18 

The customers of a salesman arrive according to a Poisson process, at the (average) 
rate of two customers per hour. What is the distribution of the time X needed until ten 
customers have arrived? Give also the parameter(s) of this distribution. 

Question no. 19 

Calculate $P[|X| < 1/2]$ if $X \sim \mathrm{N}(0, 1)$.

Question no. 20 

Calculate the 10th percentile of $X \sim \mathrm{N}(1, 2)$.

Question no. 21 

Devices are made up of five independent components. A given device operates if at 
least four of its five components are active. Each component operates with probability 
0.95. We receive a very large batch of these devices. We inspect the devices, taken at 
random and with replacement, one at a time until a first device that does not operate 
has been obtained. 

(a) What is the probability that a device taken at random operates? 

(b) What is the expected value of the number of devices that will have to be inspected? 

Question no. 22 

City buses pass by a certain street corner, between 7:00 a.m. and 7:30 p.m., according 
to a Poisson process at the (average) rate of four per hour. 

(a) What is the probability that at least 30 minutes elapse between the first and the 
third bus? 

(b) What is the variance of the waiting time between the first and the third bus? 

(c) Given that a woman has been waiting for 5 minutes, what is the probability that 
she will have to wait 15 more minutes? 

Question no. 23 

Suppose that the length X (in meters) of an arbitrary parking place follows a
$\mathrm{N}(\mu, 0.01\mu^2)$ distribution.

(a) A man owns a luxury car whose length is 15% greater than the average length of a
parking place. What proportion of free parking places can he use?

(b) Suppose that $\mu = 4$. What should be the length of a car if we want its owner to be
able to use 90% of the free parking places?

Question no. 24

A student gets the correct answers, on average, to half of the probability problems 
she attempts to solve. In an exam, there are ten independent questions. What is the 
probability that she can solve more than half? 

Question no. 25 

Let X be a random variable having a binomial distribution with parameters n = 100 
and p = 0.1. Use a Poisson distribution to calculate approximately P[X = 15]. 

Question no. 26 

Let 

\[ f_X(x) = \sqrt{\frac{2}{\pi} - x^2} \quad \text{for } -\sqrt{2/\pi} < x < \sqrt{2/\pi}. \]

Calculate P[X < 0]. 

Question no. 27 

The results of an intelligence quotient (IQ) test for the pupils of a certain elementary
school showed that the IQ of these pupils follows (approximately) a normal distribution
with parameters $\mu = 100$ and $\sigma^2 = 225$. What total percentage of pupils have an IQ
smaller than 91 or greater than 130?

Question no. 28 

A certain assembly requires 59 nondefective transistors. We have at our disposal 
60 transistors taken at random from those fabricated by a machine that is known to 
produce 5% defective transistors. 

(a) Calculate the probability that the assembly can be made. 

(b) Obtain an approximate value of the probability in (a) with the help of a Poisson 
distribution. 

(c) Suppose that, in fact, we have a very large number of transistors at our disposal, 
of which 5% are defective. What is the probability that we will have to take exactly 60 
transistors at random to get 59 nondefective ones? 

Question no. 29 

Let X be the delivery time (in days) for a certain product. We know that X is a 
continuous random variable whose mean is 7 and whose standard deviation is equal to 
1. Determine a time interval for which, whatever the distribution of X is, we can assert 
that the delivery times will be in this interval with a probability of at least 0.9. 

Question no. 30 

The entropy H of a continuous random variable X is defined by $H = E[-\ln f_X(X)]$,
where $f_X$ is the density function of X and ln denotes the natural logarithm. Calculate
the entropy of a Gaussian random variable with zero mean and variance $\sigma^2 = 2$.

Question no. 31

The number N of devices that a technician must try to repair during the course of an 
arbitrary workday is a random variable having a geometric distribution with parameter 
p = 1/8. We estimate the probability that he manages to repair a given device to be 
equal to 0.95, independently from one device to another. 

(a) What is the probability that the technician manages to repair exactly five devices, 
before his second failure, during a given workday, if we assume that he will receive at 
least seven out-of-order devices in the course of this particular workday? 

(b) If, in the course of a given workday, the technician received exactly ten devices for 
repair, what is the probability that he managed to repair exactly eight of those? 

(c) Use a Poisson distribution to calculate approximately the probability in part (b). 

(d) Suppose that exactly eight of the ten devices in part (b) have indeed been repaired. 
If we take three devices at random and without replacement among the ten that the 
technician had to repair, what is the probability that the two devices he could not repair 
are among those? 

Question no. 32 

The number X of raisins in an arbitrary cookie has a Poisson distribution with
parameter $\lambda$. What value of $\lambda$ must be chosen if we want the probability that at most
2 cookies, in a bag of 20, contain no raisins to be 0.925?

Question no. 33 

The storage tank of a gas station is filled once a week. Suppose that the weekly
demand X (in thousands of liters) is a random variable following an exponential
distribution with parameter $\lambda = 1/10$. What must the capacity of the storage tank be if we
want the probability of exhausting the weekly supply to be 0.01?

Question no. 34 

We are interested in the lifetime X (in years) of a machine. From past experience, 
we estimate the probability that a machine of this type lasts for more than nine years 
to be 0.1. 

(a) We propose the following model for the density function of X: 

fx(x) = -— a — B for X > 0, 

(x+ 1) 

where a > 0 and b > 1. Find the constants a and b. 

(b) If we propose a normal distribution with mean p = 7 for X, what must the value of 
the parameter a be? 

(c) We consider ten machines of this type, which are assumed to be independent. Cal¬ 
culate the probability that eight or nine of these machines last for less than nine years. 
Exercises 


Question no. 1 

A continuous random variable has the following density function: 



(a) Calculate the constant c. 

(b) Obtain (integrating by parts) the distribution function $F_X(x)$. 

(c) Find the mean of X. 

(d) Calculate the standard deviation of X. 

(e) Show that the median of X is located between 3 and 4. 

Indication. See (3.10). 

Question no. 2 

A merchant receives a batch of 100 electrical devices. To save time, she decides to use 
the following sampling plan: she takes two devices, at random and without replacement, 
and she decides to accept the whole batch if the two devices selected are nondefective. 
Let X be the random variable denoting the number of defective devices in the sample. 

(a) Give the probability distribution of X as well as the parameters of this distribution. 

(b) If the batch contains exactly two defective devices, calculate the probability that it 
is accepted. 

(c) Approximate the probability computed in part (b) with the help of a binomial 
distribution. 

(d) Approximate the probability calculated in part (c) by using a Poisson distribution. 
Remark. Give your answers with four decimals. 

Question no. 3 

In a dart game, the player aims at a circular target having a radius of 25 centimeters. 
Let X be the distance (in centimeters) between the dart’s impact point and the center 
of the target. Suppose that 



where c is a constant. 

(a) Calculate 

(i) the constant c; 

(ii) the density function, $f_X(x)$, of X; 

(iii) the mean of X; 
(iv) the probability P[X < 10 | X > 5]. 

(b) It costs $1 to throw a dart and the player wins 

    $10 if X < r, 
    $1 if r < X < 2r, 
    $0 if 2r < X < 25. 

For what value of r is the average gain of the player equal to $0.25? 

Question no. 4 

Telephone calls arrive at an exchange according to a Poisson process with rate $\lambda$ per 
minute. We know, from past experience, that the probability of receiving exactly one 
call during a one-minute period is three times that of receiving no calls during the same 
time period. For each of the following questions, give the probability distribution of the 
random variable and calculate, if the case may be, the requested probability. 

(a) Let X be the number of calls received over a one-minute period. What is the 
probability P[2 < X < 4]? 

(b) Let Y be the number of calls received over a three-minute period. Calculate the 
probability P[Y > 4]. 

(c) Let $W_1$ be the waiting time (in minutes) until the first call, from time t = 0. Calculate 
$P[W_1 < 1]$. 

(d) Let $W_2$ be the waiting time (in minutes) between the first and the second call. 
Calculate $P[W_2 > 1]$. 

(e) Let W be the waiting time until the second call, from time t = 0. Give the probability 
distribution of W as well as its parameters. 

(f) We consider 100 consecutive one-minute periods and we denote by U the number of 
periods during which no calls were received. Calculate P[U < 1]. 

Question no. 5 

We have ten (independent) machines at our disposal, each producing 2% defective 
items. 

(a) How many items will be fabricated by the first machine, on average, before it 
produces a first defective item? 

(b) We take at random one item fabricated by each machine. What is the probability 
that at most two items among the ten selected are defective? 

(c) Redo part (b), using a Poisson approximation. 

(d) How many items fabricated by the first machine must be taken, at a minimum, in 
order that the probability of obtaining at least one defective item be greater than 1/2 
(assuming that the items are independent of one another)? 
Question no. 6 

We are interested in the proportion $\theta$ of defectives in a batch of manufactured articles. 
We decide to draw, at random and with replacement, a sample of 20 articles from the 
batch. 

(a) We denote by X the number of defective articles in the sample. 

(i) Give the probability function $p_X(x)$. 

(ii) Give the probability function $p_X(x)$ if the draws are made without replacement 
and if there are 1000 articles in the batch. 

(b) If the draws are made with replacement and if $\theta = 0.25$, calculate 

(i) P[X = 10]; 

(ii) P[X > 10], by using a Poisson approximation. 

Question no. 7 

Let X be a random variable having the density function 




if — 1 < x < 1, 
if \x\ > 1, 


where c is a positive constant. Calculate (a) the constant c; (b) the mean of X ; (c) the 
variance of X ; (d) the distribution function Fx{x). 

Question no. 8 

The average number of faulty articles produced by a certain manufacturing process 
is equal to six per 25-minute period, according to a Poisson process. We consider a given 
production hour divided into 12 five-minute periods. Let 

X be the number of faulty articles produced over a five-minute period; 

Y be the number of five-minute periods needed to obtain a first period during which 
no faulty articles are produced; 

Z be the number of periods, among the 12, during which no faulty articles are 
produced. 

(a) Give the distribution of X, Y, and Z as well as their parameter(s). 

(b) During which period, on average, will no faulty articles be produced for the first 
time? 

(c) What is the probability that no faulty articles are produced during exactly 2 of 
the 12 periods? 

(d) What is the probability that exactly two faulty articles have been produced during 
a given five-minute period, given that at most four faulty articles have been produced 
during this time period? 
Question no. 9 

Calculate the variance of $\sqrt{X}$ if 

$$p_X(x) = \begin{cases} 1/4 & \text{if } x = 0, \\ 1/2 & \text{if } x = 1, \\ 1/4 & \text{if } x = 2. \end{cases}$$


Question no. 10 

Calculate the 30th percentile of the continuous random variable X whose density 
function is 

$$f_X(x) = \begin{cases} x & \text{if } 0 < x < \sqrt{2}, \\ 0 & \text{elsewhere.} \end{cases}$$


Question no. 11 

A 300-page book contains 200 typos. Calculate, using a Poisson distribution, the 
probability that a particular page contains at least two typos. 

Question no. 12 

Based on past data, we estimate that 85% of the articles produced by a certain 
machine are defective. If the machine produces 20 articles per hour, what is the probability 
that 8 or 9 articles fabricated over a 30-minute period are defective? 

Question no. 13 

Calculate P[X > 8] if 

$$f_X(x) = \begin{cases} \dfrac{1}{96}\, x^3 e^{-x/2} & \text{if } x \ge 0, \\ 0 & \text{elsewhere.} \end{cases}$$


Question no. 14 

The lifetime of a certain electronic component follows an exponential distribution 
with mean equal to five years. Knowing that a given component is one year old, what 
is the probability that it will fail during its fourth year of operation? 

Question no. 15 

A security system is composed of ten components operating independently of one 
another. For the system to be operational, at least five components must be active. To 
check whether the system is operational, we periodically inspect four of its components 
taken at random (and without replacement). The system is deemed operational if at 
least three of the four components inspected are active. If, actually, only four of the ten 
components are active, what is the probability that the system is deemed operational? 


Question no. 16 

Calculate the 25th percentile of a continuous random variable X whose density 
function is 

$$f_X(x) = \begin{cases} x e^{-x^2/2} & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases}$$
Question no. 17 

Calculate the probability of obtaining exactly three “tails” in 15 (independent) tosses 
of a coin for which the probability of getting “tails” is 0.4. 

Question no. 18 

A sample of four parts is drawn without replacement from a lot of ten parts, of 
which one is defective. Calculate the probability that the defective part is included in 
the sample. 

Question no. 19 

Customers arrive at a counter, according to a Poisson process, at the average rate of 
five per minute. What is the probability that the number of customers is greater than 
or equal to ten in a given three-minute period? 

Question no. 20 

The arrivals of customers at a counter constitute a Poisson process with rate $\lambda = 1$ 
per two-minute time period. Calculate the probability that the waiting time until the 
next customer (from any time instant) is smaller than ten minutes. 

Question no. 21 

Let X be a random variable having a N(10, 2) distribution. Find its 90th percentile. 

Question no. 22 


Let 

$$F_X(x) = \begin{cases} 0 & \text{if } x < 0, \\ x/2 & \text{if } 0 \le x \le 1, \\ x/6 + 1/3 & \text{if } 1 < x < 4, \\ 1 & \text{if } x \ge 4 \end{cases}$$

be the distribution function of the continuous random variable X. 

(a) Calculate the density function of X. 

(b) What is the 75th percentile of X? 

(c) Calculate the expected value of X. 

(d) Calculate E[1/X]. 

(e) We define 

$$Y = \begin{cases} -1 & \text{if } X < 1, \\ 1 & \text{if } X \ge 1. \end{cases}$$

(i) Find $F_Y(0)$. 

(ii) Calculate the variance of Y. 


Question no. 23 

A box contains 100 brand A and 50 brand B transistors. 
(a) Transistors are drawn one by one, at random and with replacement, until a first brand 
B transistor has been obtained. What is the probability that nine or ten transistors will 
have to be drawn? 

(b) What is the minimum number of transistors that must be drawn, at random and 
with replacement, if we want the probability of obtaining only brand A transistors to 
be smaller than 1/3? 

Question no. 24 

Parts are fabricated in series. To perform a quality control check, every hour we 
draw, at random and without replacement, 10 parts from a box containing 25. The 
fabrication process is deemed under (statistical) control if at most one of the inspected 
parts is defective. 

(a) If all the inspected boxes contain exactly two defective parts, what is the probability 
that the fabrication process is deemed under control at least seven times during the 
course of an eight-hour workday? 

(b) Use a Poisson distribution to evaluate approximately the probability calculated in 
part (a). 

(c) Knowing that, on the last quality control check performed in part (a), the fabrication 
process was deemed under control, what is the probability that the corresponding sample 
of 10 parts contained no defectives? 

Question no. 25 

Let X be a random variable whose probability function is given by 

    x      |  -1     0     3 
    p_X(x) |  0.5   0.2   0.3 

(a) Calculate the standard deviation of X. 

(b) Calculate the mathematical expectation of $X^3$. 

(c) Find the distribution function of X. 

(d) We define $Y = X^2 + X + 1$. Find $p_Y(y)$. 

Question no. 26 

In a particular factory, there were 25 industrial accidents in 2005. Every year, the 
factory closes for summer holidays for two weeks in July. Answer the following questions, 
assuming that the industrial accidents occur according to a Poisson process. 

(a) What is the probability that exactly one of the 25 accidents occurred during the 
first two weeks of 2005? 

(b) If the average rate of industrial accidents remains the same in 2006, what is the 
probability that there will be exactly one accident during the first two weeks of that 
year? 
Question no. 27 

In a certain lottery, four balls are drawn at random and without replacement among 
20 balls numbered from 1 to 20. The player wins a prize if the combination that he has 
chosen comprises at least two winning numbers. A man decides to buy one ticket per 
week until he has won a prize. What is the probability that he will have to buy less 
than ten tickets? 

Question no. 28 

The density function of the random variable X is given by 

$$f_X(x) = \begin{cases} 6x(1 - x) & \text{if } 0 < x < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

(a) Calculate the mathematical expectation of 1/X. 

(b) Obtain the distribution function of X. 

(c) We define 

$$Y = \begin{cases} 2 & \text{if } X \ge 1/4, \\ 0 & \text{if } X < 1/4. \end{cases}$$

Calculate $E[Y^k]$, where k is a natural number. 

(d) Let $Z = X^2$. Find the density function of Z. 


Question no. 29 

The concentration X of reactant in a chemical reaction is a random variable whose 
density function is 

$$f_X(x) = \begin{cases} 2(1 - x) & \text{if } 0 < x < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

The amount Y (in grams) of final product is given by Y = 3X. 

(a) What is the probability that the concentration of reactant is equal to 1/2? Justify. 

(b) Calculate the variance of Y. 

(c) Obtain the density function of Y. 

(d) What is the minimum amount of final product that, in 95% of the cases, will not 
be exceeded? 


Question no. 30 

An insurance company employs 20 salespersons. Each salesperson works at the office 
or on the road. We estimate that a given salesperson is at the office at 2:30 p.m., on any 
workday, with probability 0.2, independently of the other workdays and of the other 
salespersons. 

(a) The company wants to install a minimum number of desks, so that an arbitrary 
salesperson finds a free desk in at least 90% of the cases. Find this minimum number of 
desks. 
(b) Calculate the minimum number of desks in part (a) by using a Poisson 
approximation. 

(c) A woman telephoned the office at 2:30 p.m. on the last two workdays in order to talk 
to a particular salesperson. Given that she did not manage to talk to the salesperson in 
question, what is the probability that she will have to phone at least two more times, 
assuming that she always phones at 2:30 p.m.? 

Question no. 31 

Calculate $\mathrm{VAR}[e^X]$ if X is a random variable whose probability function is given by 

$$p_X(x) = \begin{cases} 1/4 & \text{if } x = 0, \\ 1/4 & \text{if } x = 1, \\ 1/2 & \text{if } x = 4, \\ 0 & \text{otherwise.} \end{cases}$$


Question no. 32 

The density function of the random variable X is 

$$f_X(x) = \begin{cases} -x & \text{if } -1 < x < 0, \\ x & \text{if } 0 \le x < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

Calculate $F_X(1/2)$. 

Question no. 33 

Let 

$$f_X(x) = \begin{cases} 1/e & \text{if } 0 < x < e, \\ 0 & \text{elsewhere.} \end{cases}$$

Calculate $f_Y(y)$, where $Y := -2 \ln X$. 

Question no. 34 

A lot contains 20 items, of which two are defective. Three items are drawn at random 
and with replacement. Given that at least one defective item was obtained, what is the 
probability that three defectives were obtained? 

Question no. 35 

Calls arrive at an exchange according to a Poisson process, at the average rate of two 
per minute. What is the probability that, during at least one of the first five minutes of 
a given hour, no calls arrive? 

Question no. 36 

A box contains 20 granite-type and 5 basalt-type rocks. Ten rocks are taken at 
random and without replacement. Use a binomial distribution to calculate approximately 
the probability of obtaining the 5 basalt-type rocks in the sample. 
Question no. 37 

A fair coin is tossed until “heads” has been obtained ten times. What is the variance 
of the number of tosses needed to end the random experiment? 

Question no. 38 

Let X be a random variable having an exponential distribution with parameter $\lambda$. 
We define Y = int(X) + 1, where int(X) designates the integer part of X. Calculate 
$F_Y(y)$. 

Question no. 39 

Let 

fc(.)={ 4 ’’ o 

Calculate the variance of X. 

Question no. 40 

Suppose that $X \sim N(1, \sigma^2)$. Find $\sigma$ if $P[-1 < X < 3] = 1/2$. 

Question no. 41 

The distribution function of the discrete random variable X is given by 

$$F_X(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1/2 & \text{if } 0 \le x < 1, \\ 1 & \text{if } x \ge 1. \end{cases}$$

Calculate (a) $p_X(x)$; (b) $E[\cos(\pi X)]$. 

Question no. 42 

At least half of the engines of an airplane must operate to enable it to fly. If each 
engine operates, independently of the others, with probability 0.6, is an airplane having 
four engines more reliable than a two-engine airplane? Justify. 

Question no. 43 

We define Y = |X|, where X is a continuous random variable whose density function 
is 

$$f_X(x) = \begin{cases} 3/4 & \text{if } -1 < x < 0, \\ 1/4 & \text{if } 1 < x < 2, \\ 0 & \text{elsewhere.} \end{cases}$$

What is the 95th percentile of Y? 

Question no. 44 

The probability that a part produced by a certain machine conforms to the technical 
specifications is equal to 0.95, independently from one part to the other. We collect parts 
produced by this machine until we have obtained one part that conforms to the technical 
specifications. This random experiment is repeated on 15 consecutive (independent) 
days. Let X be the number of days, among the 15 days considered, during which we 
had to collect at least two parts to get one part conforming to the technical specifications. 


2 * if x > 0, 

if x < 0. 
(a) What is the mean value of X? 

(b) Use a Poisson distribution to calculate approximately the conditional probability 

$$P[X = 2 \mid X > 1].$$


Question no. 45 

Ten samples of size 10 are drawn at random and without replacement from identical 
lots containing 100 articles, of which two are defective. A given lot is accepted if at most 
one defective article is found in the corresponding sample. What is the probability that 
less than nine of the ten lots are accepted? 

Question no. 46 

The number X of particles emitted by a certain radioactive source during a one-hour 
period is a random variable following a Poisson distribution with parameter $\lambda = \ln 5$. 
Furthermore, we assume that the emissions of particles are independent from hour to 
hour. 

(a) (i) Calculate the probability that during at least 30 hours, among the 168 hours of 
a given week, no particles are emitted. 

(ii) Use a Poisson distribution to calculate approximately the probability in part (i). 

(b) Calculate the probability that the fourth hour, during which no particles are emitted, 
takes place over the course of the first day of the week considered in part (a). 

Question no. 47 

The duration X (in hours) of major power failures, in a given region, follows 
approximately a normal distribution with mean $\mu = 2$ and standard deviation $\sigma = 0.75$. 
Find the duration $x_0$ for which the probability that an arbitrary major power failure 
lasts at least 30 minutes more than $x_0$ is equal to 0.06. 

Question no. 48 

A continuous random variable X has the following density function: 



where k > 0 is a constant. 

(a) Calculate the mean and the variance of X. 

(b) What is the effect of the constant k on the shape of the function $f_X$? 

Remark. You can calculate (using a mathematical software package, if possible) the 
coefficients $\beta_1$ and $\beta_2$ to answer this question. 

Question no. 49 

In a particular region, the daily temperature X (in degrees Celsius) during the month 
of September has a normal distribution with parameters $\mu = 15$ and $\sigma^2 = 25$. 

(a) Let Y be the random variable designating the temperature, given that it is above 
17 degrees Celsius. That is, $Y := X \mid \{X > 17\}$. Calculate the density function of Y. 
(b) Calculate the (exact) probability that during the month of September the 
temperature exceeds 17 degrees Celsius on exactly ten days. 

Question no. 50 

The amount X of rain (in millimeters) that falls over a 24-hour period, in a given 
region, is a random variable such that 

$$F_X(x) = \begin{cases} 0 & \text{if } x < 0, \\ 3/4 & \text{if } x = 0, \\ 1 - \frac{1}{4} e^{-x^2} & \text{if } x > 0. \end{cases}$$

Calculate (a) the expected value of X\ (b) the mathematical expectation and the vari¬ 
ance of the random variable Y := e x / 2 . 

Reminder. The variable X is an example of what is known as a random variable of 
mixed type, because it can take on the value 0 with a positive probability, but all the 
positive real numbers have a zero probability of occurring (as in the case of a continuous 
random variable). To answer the previous questions, one must make use of the formulas 
for discrete and continuous random variables at the same time (see p. 83). 

Question no. 51 

The number of typos in a 500-page book has a Poisson distribution with parameter 
$\lambda = 2$ per page, independently from one page to the other. 

(a) What is the probability that more than ten pages will have to be taken, at random 
and with replacement, to obtain three pages containing at least two typos each? 

(b) Suppose that there are actually 20 pages, among the 500, that contain exactly five 
typos each. 

(i) If 100 pages are taken, at random and without replacement, what is the 
probability that less than five pages contain exactly five typos each? 

(ii) We consider 50 identical copies of this book. If the random experiment in part 
(i) is repeated for each of these books, what is the probability that, for exactly 30 of 
the 50 copies, less than five pages with exactly five typos each are obtained? 

Question no. 52 

A manufacturer sells an article at a fixed price s. He reimburses the purchase price 
to every customer who discovers that the weight of the article is smaller than a given 
weight $w_0$ and he recuperates the article, whose value of the reusable raw material is 
r (< s). The weight W follows approximately a normal distribution with mean $\mu$ and 
variance $\sigma^2$. An appropriate setting enables one to fix $\mu$ to any desirable value, but it 
is not possible to fix the value of $\sigma$. The cost price C is a function of the weight of the 
article: $C = \alpha + \beta W$, where $\alpha$ and $\beta$ are positive constants. 

(a) Give an expression for the profit Z in terms of W. 

(b) We can show that the average profit, $z(\mu)$, is given by 

$$z(\mu) = s - \alpha - \beta\mu - (s - r)\,P[W < w_0].$$

Find the value $\mu_0$ of $\mu$ that maximizes $z(\mu)$. 

Question no. 53 

In a collection of 20 rocks, 10 are of basalt type and 10 are of granite type. Five 
rocks are taken at random and without replacement to perform chemical analyses. Let 
X be the number of basalt-type rocks in the sample. 

(a) Give the probability distribution of X as well as its parameters. 

(b) Calculate the probability that the sample contains only rocks of the same type. 

Multiple choice questions 


Question no. 1 

Let 

$$f_X(x) = \begin{cases} 1/2 & \text{if } 0 < x < 1, \\ 1/(2x) & \text{if } 1 \le x < e. \end{cases}$$

Calculate P[X < 2 | X > 1]. 


(a) 


In 2 



( c ) 


2 In 2 


(d) In 2 (e) 1 


Question no. 2 

Suppose that $X \sim U(0, 1)$. Find $E[(X - E[X])^3]$. 
(a) 0 (b) 1/4 (c) 1/3 (d) 1/2 (e) 2/3 


Question no. 3 

Let X ~ B (n = 2,p = 0.5). Calculate P[X > 1 | X < 1]. 


(a) 0 (b) 1/4 (c) 1/2 (d) 2/3 (e) 1 

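Question no. 4 

Find $E[X^2]$ if $p_X(0) = e^{-\lambda}$ and 

$$p_X(x) = \frac{e^{-\lambda} \lambda^x}{x!} \quad \text{for } x = 1, 2, \ldots,$$

where $\lambda > 0$. 

(a) $\lambda$ (b) $\lambda^2 + \lambda$ (c) $\lambda^2 - \lambda$ (d) $\lambda^2$ (e) $2\lambda^2$ 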
Question no. 5 

Let 

$$f_X(x) = \begin{cases} e^{x}/2 & \text{if } x < 0, \\ e^{-x}/2 & \text{if } x \ge 0. \end{cases}$$

Calculate the variance of X. 


(a) 1 (b) 3/2 (c) 2 (d) 3 (e) 4 


Question no. 6 

Suppose that 



Calculate $p_X(0) + p_X(1)$. 

(a) 0 (b) 1/4 (c) 1/2 (d) 3/4 (e) 1 

Question no. 7 

Let $f_X(x) = 2x e^{-x^2}$, for x > 0. We set $Y = \ln X$. Find $f_Y(y)$, for any $y \in \mathbb{R}$. 

(a) $2e^{2y} e^{-e^{2y}}$ (b) $2e^{2y} e^{-y^2}$ (c) $2e^{y} e^{-e^{2y}}$ (d) $2e^{-2y}$ (e) 

Question no. 8 

Calculate P[X < 5] if $X \sim G(\alpha = 5, \lambda = 1/2)$. 

(a) 0.109 (b) 0.243 (c) 0.5 (d) 0.757 (e) 0.891 

Question no. 9 

Suppose that X is a discrete random variable whose set of possible values is 
{0, 1, 2, ...}. Calculate P[X = 0] if $E[t^X] = e^{\lambda(t-1)}$, where t is a real constant. 

(a) 0 (b) 1/4 (c) 1/2 (d) $e^{-2\lambda}$ (e) $e^{-\lambda}$ 

Question no. 10 

Calculate $P[X^2 < 4]$ if $X \sim \mathrm{Exp}(\lambda = 2)$. 

(a) $1 - e^{-4}$ (b) $2(1 - e^{-4})$ (c) (d) $e^{-4}$ (e) $2e^{-4}$ 

Question no. 11 

Calculate P[0 < X < 2] if 




e 


-3 


2 





Remark. The random variable X is of mixed type (see p. 83 and Exercise no. 50, p. 109). 
Note that the function $F_X(x)$ is discontinuous at the point x = 0. 


Question no. 12 

We define Y = |X|, where X is a continuous random variable whose density function 
is 

$$f_X(x) = \begin{cases} 1/2 & \text{if } -1 < x < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

Find $f_Y(y)$. 


(a) 1 if -1 < y < 1 (b) if -1 < y < 1 (c) y if 0 < y < 1 

(d) 2y if 0 < y < 1 (e) 1 if 0 < y < 1 


Question no. 13 

Let 

    x      |   1     4     9 
    p_X(x) |  1/4   1/4   1/2 

We have that $E[\sqrt{X}] = 9/4$ and $E[X] = 23/4$. Calculate the standard deviation of 
$\sqrt{X} + 4$. 


(a) 


\/IT 


(b)^ + 2 


( 0^+4 




(e)ii + 2 


Question no. 14 

Suppose that 


$$f_X(x) = \begin{cases} 1/e & \text{if } 0 < x < e, \\ 0 & \text{elsewhere.} \end{cases}$$


Find a function g(x) such that if Y := g(X), then fyiv) = e y 1 , for y > 1. 
(a) (b) e^ -1 (c) (d) — lnx (e) lnx 


Question no. 15 

Calculate the third-order central moment of the discrete random variable X whose 
probability function is 

    x      |  -1     0     1 
    p_X(x) |  1/8   1/2   3/8 


(a) -3/32 (b) 0 (c) 1/64 (d) 3/32 (e) 1/4 

Question no. 16 

Let X be a continuous random variable defined on the interval (a, b). What is the 
density function of the random variable $Y := F_X(X)$? 

(a) It is not defined (b) 0 (c) 1 if 0 < y < 1 (d) $f_X(y)$ 

(e) 2y if 0 < y < 1 

Question no. 17 

The rate of suicides in a certain city is equal to four per month, according to a Poisson 
distribution, independently from one month to the other. Calculate the probability that 
during at least one month over the course of a given year there will be at least eight 
suicides. 

(a) 0.0520 (b) 0.1263 (c) 0.4731 (d) 0.5269 (e) 0.8737 

Question no. 18 

A telephone survey has been conducted to determine public opinion on the 
construction of a new nuclear plant. There are 150,000 subscribers whose phone numbers 
are published in the telephone directory of a certain city and we assume that 90,000 
among them would express a negative opinion if asked. Let X be the number of 
negative opinions obtained in 15 calls made at random (among the 150,000 listed numbers). 
Calculate approximately P[X = 9] if we also assume that nobody was contacted more 
than once. 

(a) 0.1666 (b) 0.1766 (c) 0.1866 (d) 0.1966 (e) 0.2066 

Question no. 19 

A multiple choice examination comprises 30 questions. For each question, five 
answers are proposed. Every correct answer is worth two points and for every wrong answer 
1/2 point is deducted. Suppose that a student has already answered 20 questions. Then, 
she decides to select the letter a for each of the remaining 10 questions, without even 
reading these questions. If the correct answers are distributed at random among the 
letters a, b, c, d and e, what is her expected total mark (on 60), assuming that she has 
four chances out of five of having the correct answer to any of the first 20 (independent) 
questions she has already done? 

(a) 26 (b) 28 (c) 30 (d) 32 (e) 36 

Question no. 20 

Let $p_X(x) = (3/4)^{x-1}(1/4)$, for x = 1, 2, ... . Calculate the expected value of the 
discrete random variable X, given that X is greater than 2. 

(a) 4 (b) 5 (c) 6 (d) 7 (e) 8 


4 


Random vectors 


The notion of random variables can be generalized to the case of two (or more) 
dimensions. In this textbook, we consider in detail only two-dimensional random vectors. 
However, the extension of the various definitions to the multidimensional case is 
immediate. In this chapter, we also state the most important theorem in probability theory, 
namely the central limit theorem. 


4.1 Discrete random vectors 

The joint probability function 

$$p_{X,Y}(x_j, y_k) := P[\{X = x_j\} \cap \{Y = y_k\}] = P[X = x_j, Y = y_k]$$

of the pair of discrete random variables (X, Y), whose possible values are a (finite or 
countably infinite) set of points $(x_j, y_k)$ in the plane, has the following properties: 

(i) $p_{X,Y}(x_j, y_k) \ge 0 \ \ \forall (x_j, y_k)$; 

(ii) $\sum_{j=1}^{\infty} \sum_{k=1}^{\infty} p_{X,Y}(x_j, y_k) = 1$. 

The joint distribution function $F_{X,Y}$ is defined by 

$$F_{X,Y}(x, y) = P[\{X \le x\} \cap \{Y \le y\}] = \sum_{x_j \le x} \sum_{y_k \le y} p_{X,Y}(x_j, y_k).$$


Example 4.1.1. Consider the joint probability function px,Y given by the following 
table: 


M. Lefebvre, Basic Probability Theory with Applications, Springer Undergraduate Texts in Mathematics 
and Technology, DOI: 10.1007/978-0-387-74995-2_4, 

© Springer Science + Business Media, LLC 2009 
    y \ x |  -1      0      1 
    ------+--------------------- 
      0   |  1/16   1/16   1/16 
      1   |  1/16   1/16   2/16 
      2   |  2/16   1/16   6/16 

We can check that the function $p_{X,Y}$ possesses the two properties of joint probability 
functions stated above. Furthermore, given that only the points (-1, 0) and (0, 0) are 
such that $x_j \le 0$ and $y_k \le 1/2$, we may write that 

$$F_{X,Y}(0, 1/2) = p_{X,Y}(-1, 0) + p_{X,Y}(0, 0) = \frac{1}{8}.$$
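
As a quick check of this computation, the following Python sketch stores the table of Example 4.1.1 as a dictionary and evaluates the joint distribution function by direct summation. The names `pmf` and `joint_cdf` are illustrative choices, not notation from the text.

```python
from fractions import Fraction as F

# Joint probability function of Example 4.1.1, keyed by (x, y).
pmf = {
    (-1, 0): F(1, 16), (0, 0): F(1, 16), (1, 0): F(1, 16),
    (-1, 1): F(1, 16), (0, 1): F(1, 16), (1, 1): F(2, 16),
    (-1, 2): F(2, 16), (0, 2): F(1, 16), (1, 2): F(6, 16),
}

def joint_cdf(x, y):
    """F_{X,Y}(x, y): sum of p(x_j, y_k) over all x_j <= x and y_k <= y."""
    return sum(p for (xj, yk), p in pmf.items() if xj <= x and yk <= y)

# Property (i): nonnegativity; property (ii): the probabilities sum to 1.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

print(joint_cdf(0, F(1, 2)))  # 1/8
```

Exact fractions are used so that the result can be compared with the value obtained by hand without rounding error.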


When the function $p_{X,Y}$ is summed over all possible values of Y (resp., X), the 
resulting function is called the marginal probability function of X (resp., Y). That is, 

$$p_X(x_j) = \sum_{k=1}^{\infty} p_{X,Y}(x_j, y_k) \quad \text{and} \quad p_Y(y_k) = \sum_{j=1}^{\infty} p_{X,Y}(x_j, y_k).$$


Example 4.1.1 (continued). We find that 

    x      |  -1     0      1    |  Σ 
    p_X(x) |  1/4   3/16   9/16  |  1 

    y      |   0     1      2    |  Σ 
    p_Y(y) |  3/16  1/4    9/16  |  1 


Definition 4.1.1. Two discrete random variables, X and Y, are said to be 
independent if and only if 

$$p_{X,Y}(x_j, y_k) = p_X(x_j)\, p_Y(y_k) \quad \text{for any point } (x_j, y_k). \qquad (4.1)$$

Example 4.1.1 (continued). We have that $p_{X,Y}(-1, 0) = 1/16$, $p_X(-1) = 1/4$ and 
$p_Y(0) = 3/16$. Because $1/16 \ne (1/4)(3/16)$, X and Y are not independent random 
variables. 
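
The marginal probability functions and the independence test (4.1) can be verified mechanically. The sketch below, in Python, recomputes the marginals of Example 4.1.1 from the joint table and checks the product condition at every point; the helper name `is_independent` is a hypothetical choice made for this illustration.

```python
from fractions import Fraction as F
from itertools import product

# Joint probability function of Example 4.1.1, keyed by (x, y).
pmf = {
    (-1, 0): F(1, 16), (0, 0): F(1, 16), (1, 0): F(1, 16),
    (-1, 1): F(1, 16), (0, 1): F(1, 16), (1, 1): F(2, 16),
    (-1, 2): F(2, 16), (0, 2): F(1, 16), (1, 2): F(6, 16),
}

xs = sorted({x for x, _ in pmf})
ys = sorted({y for _, y in pmf})

# Marginals: sum the joint function over the other variable.
p_x = {x: sum(pmf[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(pmf[(x, y)] for x in xs) for y in ys}

def is_independent(pmf, p_x, p_y):
    """Condition (4.1): p(x, y) == p_X(x) * p_Y(y) at every point."""
    return all(pmf[(x, y)] == p_x[x] * p_y[y] for x, y in product(p_x, p_y))

print(p_x[-1], p_y[0], pmf[(-1, 0)])   # 1/4 3/16 1/16
print(is_independent(pmf, p_x, p_y))   # False
```

A single point where the product condition fails is enough to rule out independence, which is why the text only exhibits the point (-1, 0).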

Finally, let $A_X$ be an event defined in terms of the random variable X. For instance, 
$A_X = \{X > 0\}$. We define the conditional probability function of Y, given the event 
$A_X$, by 

$$p_Y(y \mid A_X) = P[Y = y \mid A_X] = \frac{P[\{Y = y\} \cap A_X]}{P[A_X]} \quad \text{if } P[A_X] > 0.$$

Likewise, we define 

$$p_X(x \mid A_Y) = P[X = x \mid A_Y] = \frac{P[\{X = x\} \cap A_Y]}{P[A_Y]} \quad \text{if } P[A_Y] > 0.$$


Remark. If X and Y are independent discrete random variables, then we may write that 
$p_Y(y \mid A_X) = p_Y(y)$ and $p_X(x \mid A_Y) = p_X(x)$. 


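Example 4.1.1 (continued). Let $A_X = \{X = 1\}$. We have: 

$$p_Y(y \mid X = 1) = \frac{P[\{Y = y\} \cap \{X = 1\}]}{P[X = 1]} = \frac{p_{X,Y}(1, y)}{p_X(1)} = \frac{16}{9}\, p_{X,Y}(1, y) = \begin{cases} 1/9 & \text{if } y = 0, \\ 2/9 & \text{if } y = 1, \\ 2/3 & \text{if } y = 2. \end{cases}$$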
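
The conditional probability function of Y given {X = x} is the corresponding column of the joint table, rescaled so that it sums to 1. A minimal Python sketch of that normalization for Example 4.1.1 follows; the function name `conditional_pmf_of_y` is an assumption made for this illustration.

```python
from fractions import Fraction as F

# Joint probability function of Example 4.1.1, keyed by (x, y).
pmf = {
    (-1, 0): F(1, 16), (0, 0): F(1, 16), (1, 0): F(1, 16),
    (-1, 1): F(1, 16), (0, 1): F(1, 16), (1, 1): F(2, 16),
    (-1, 2): F(2, 16), (0, 2): F(1, 16), (1, 2): F(6, 16),
}

def conditional_pmf_of_y(x):
    """Return {y: p_Y(y | X = x)}, which requires P[X = x] > 0."""
    p_x = sum(p for (xj, _), p in pmf.items() if xj == x)
    if p_x == 0:
        raise ValueError("conditioning event has probability zero")
    return {yk: p / p_x for (xj, yk), p in pmf.items() if xj == x}

cond = conditional_pmf_of_y(1)
print(cond[0], cond[1], cond[2])  # 1/9 2/9 2/3
```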


Example 4.1.2. A box contains one brand A, two brand B, and three brand C 
transistors. Two transistors are drawn at random and with replacement. Let X (resp., Y) be 
the number of brand A (resp., brand B) transistors among the two selected at random. 

(a) Calculate the joint probability function $p_{X,Y}(x, y)$. 

(b) Find the marginal probability functions. 

(c) Are the random variables X and Y independent? Justify. 

(d) Calculate the probability P[X = Y]. 

Solution. (a) The possible values of the pair (X, Y) are: (0, 0), (0, 1), (1, 0), (1, 1), 
(0, 2), and (2, 0). Because the transistors are taken with replacement, so that the draws 
are independent, we obtain the following table: 

    y \ x |   0     1     2 
    ------+------------------- 
      0   |  1/4   1/6   1/36 
      1   |  1/3   1/9   0 
      2   |  1/9   0     0 

For instance, we have: 

$$p_{X,Y}(0, 0) \overset{\text{ind.}}{=} (1/2)(1/2) = 1/4$$

(by independence of the draws) and 

$$p_{X,Y}(1, 0) = P[A_1 \cap C_2] + P[C_1 \cap A_2] \overset{\text{ind., sym.}}{=} 2(1/6)(1/2) = 1/6,$$

where $A_k$ is the event “a brand A transistor is obtained on the kth draw,” and so on. 
We can check that the sum of all the fractions in the table is equal to 1. 
Remark. In fact, the random variables X and Y follow binomial distributions with 
parameters n = 2 and p = 1/6, and with parameters n = 2 and p = 1/3, respectively. 

(b) From the table in part (a), we find that 

    x      |   0      1      2    |  Σ 
    p_X(x) |  25/36  5/18   1/36  |  1 

    y      |   0      1      2    |  Σ 
    p_Y(y) |  4/9    4/9    1/9   |  1 


(c) The random variables X and Y are not independent, because, for instance, 

$$p_{X,Y}(2, 2) = 0 \ne p_X(2)\, p_Y(2) = (1/36)(1/9).$$

Remark. The random variables X and Y could not be independent, because the relation 
$0 \le X + Y \le 2$ must be satisfied. 

(d) We calculate 

$$P[X = Y] = p_{X,Y}(0, 0) + p_{X,Y}(1, 1) + p_{X,Y}(2, 2) = \frac{1}{4} + \frac{1}{9} + 0 = \frac{13}{36}.$$
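
The table of Example 4.1.2 can also be obtained by brute force. Since the two draws are made with replacement from six transistors, the 36 ordered pairs are equally likely, and the sketch below (Python, with illustrative names such as `box` and `p_equal`) simply counts them.

```python
from collections import defaultdict
from fractions import Fraction as F
from itertools import product

# One brand A, two brand B, and three brand C transistors.
box = ["A", "B", "B", "C", "C", "C"]

# Enumerate the 36 equally likely ordered pairs of draws (with replacement).
pmf = defaultdict(F)  # missing keys default to Fraction(0)
for draw in product(box, repeat=2):
    x = draw.count("A")  # number of brand A transistors
    y = draw.count("B")  # number of brand B transistors
    pmf[(x, y)] += F(1, 36)

# P[X = Y]: add the probabilities on the diagonal x == y.
p_equal = sum(p for (x, y), p in pmf.items() if x == y)

print(pmf[(0, 0)], pmf[(1, 0)], pmf[(0, 1)])  # 1/4 1/6 1/3
print(p_equal)                                # 13/36
```

Enumeration is only practical for small experiments such as this one; for larger ones, the multiplication rule used in the solution is the method of choice.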


4.2 Continuous random vectors 

Let X and Y be two continuous random variables. The generalization of the notion of 
density function to the two-dimensional case is the joint density function $f_{X,Y}$ of the 
pair (X, Y). This function is such that 

$$P[x < X \le x + \epsilon,\ y < Y \le y + \delta] = \int_{y}^{y+\delta} \int_{x}^{x+\epsilon} f_{X,Y}(u, v)\, du\, dv$$

and has the following properties: 

(i) $f_{X,Y}(x, y) \ge 0$ for any point (x, y); 

(ii) $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx\, dy = 1$. 

Remark. In n dimensions, a continuous random vector possesses a joint density function 
defined on $\mathbb{R}^n$ (or on an uncountably infinite subset of $\mathbb{R}^n$). This function is nonnegative 
and its integral on $\mathbb{R}^n$ is equal to 1. 

The joint distribution function is defined by 

$$F_{X,Y}(x, y) = P[X \le x,\ Y \le y] = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{X,Y}(u, v)\, du\, dv.$$
Example 4.2.1. Consider the function $f_{X,Y}$ defined by 

$$f_{X,Y}(x, y) = c\,xy\,e^{-x^2 - y^2} \quad \text{for } x \ge 0,\ y \ge 0,$$

where c > 0 is a constant. We have that (i) $f_{X,Y}(x, y) \ge 0$ for any point (x, y) with 
$x \ge 0$ and $y \ge 0$ [$f_{X,Y}(x, y) = 0$ elsewhere] and (ii) 

$$\int_0^\infty \int_0^\infty c\,xy\,e^{-x^2 - y^2}\, dx\, dy = c \int_0^\infty x e^{-x^2}\, dx \int_0^\infty y e^{-y^2}\, dy = c \left[ -\frac{e^{-x^2}}{2} \right]_0^\infty \left[ -\frac{e^{-y^2}}{2} \right]_0^\infty = c\,(1/2)(1/2) = c/4.$$

So, this function is a valid joint density function if and only if the constant c is equal 
to 4. 

The joint distribution function of the pair (X, Y) is given by 

py px 

Fx,y( x iV) = / 4m;e _n ~ v dudv 

Jo Jo 

= / 2 ue~ u du [ 2ve~ v dv 


f x _ 2 f y 
/ 2^xe U du 
Jo Jo 


-n 2 

X- 


-v 2 

y- 

-e u 



-e v 


. 

0 - 



0 - 


(1 — e ~ x )0~e~ y ) 


for x > 0 and y > 0. 

Remark. We have that Fx,y(x , y) = 0 if x < 0 or y < 0. 

The marginal density functions of X and Y are defined by

$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy \quad \text{and} \quad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx.$$

Remark. We can easily generalize the previous definitions to the case of three or more
random variables. For instance, the joint density function of the random vector (X, Y, Z)
is a nonnegative function $f_{X,Y,Z}(x,y,z)$ such that

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y,Z}(x,y,z)\,dx\,dy\,dz = 1.$$

Moreover, the joint density function of the random vector (X, Y) is obtained as follows:

$$f_{X,Y}(x,y) = \int_{-\infty}^{\infty} f_{X,Y,Z}(x,y,z)\,dz.$$

Finally, the marginal density function of the random variable X is given by

$$f_X(x) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y,Z}(x,y,z)\,dy\,dz.$$

Definition 4.2.1. The continuous random variables X and Y are said to be independent
if and only if

$$f_{X,Y}(x,y) = f_X(x)\,f_Y(y) \quad \text{for any point } (x,y). \tag{4.2}$$


Example 4.2.1 (continued). We have:

$$f_X(x) = \int_0^\infty 4xy\,e^{-x^2 - y^2}\,dy = 2x e^{-x^2} \int_0^\infty 2y e^{-y^2}\,dy = 2x e^{-x^2} \left[-e^{-y^2}\right]_0^\infty = 2x e^{-x^2} \quad \text{for } x > 0.$$

Then, by symmetry, we may write that

$$f_Y(y) = 2y e^{-y^2} \quad \text{for } y > 0.$$

Furthermore, because

$$f_X(x)\,f_Y(y) = 4xy\,e^{-x^2 - y^2} = f_{X,Y}(x,y)$$

for any point $(x,y)$ (with x > 0 and y > 0), the random variables X and Y are
independent.


Finally, let $A_Y$ be an event depending only on Y. For example, $A_Y = \{0 < Y < 1\}$.
The conditional density function of X, given that $A_Y$ occurred, is given by

$$f_X(x \mid A_Y) = \frac{\int_{A_Y} f_{X,Y}(x,y)\,dy}{P[A_Y]} \quad \text{if } P[A_Y] > 0.$$

If $A_Y$ is an event of the form $\{Y = y\}$, we can show that

$$f_X(x \mid Y = y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} \quad \text{if } f_Y(y) > 0.$$

That is, the conditional density function $f_X(x \mid Y = y)$ is obtained by dividing the
joint density function of (X, Y), evaluated at the point $(x,y)$, by the marginal density
function of Y evaluated at the point y.

Remarks. (i) If X and Y are two independent continuous random variables, then we
have:

$$f_X(x \mid A_Y) = f_X(x) \quad \text{and} \quad f_Y(y \mid A_X) = f_Y(y).$$

(ii) In general, if X is a continuous random variable, then we can write that

$$P[Y \in A_Y] = \int_{-\infty}^{\infty} P[Y \in A_Y \mid X = x]\,f_X(x)\,dx, \tag{4.3}$$

where $A_Y$ is an event that involves only the random variable Y. In the case when X is
discrete, we have:

$$P[Y \in A_Y] = \sum_{k=1}^{\infty} P[Y \in A_Y \mid X = x_k]\,p_X(x_k). \tag{4.4}$$

These formulas are extensions of the law of total probability from Chapter 2. We also
have, for instance:

$$P[Y > X] = \int_{-\infty}^{\infty} P[Y > X \mid X = x]\,f_X(x)\,dx = \int_{-\infty}^{\infty} P[Y > x \mid X = x]\,f_X(x)\,dx, \tag{4.5}$$

and so on.


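Definition 4.2.2. The conditional expectation of the random variable Y, given that
X = x, is defined by

$$E[Y \mid X = x] = \begin{cases} \sum_{j=1}^{\infty} y_j\,p_Y(y_j \mid X = x) & \text{if } (X,Y) \text{ is discrete}, \\ \int_{-\infty}^{\infty} y\,f_Y(y \mid X = x)\,dy & \text{if } (X,Y) \text{ is continuous}. \end{cases}$$

Remarks. (i) We can show that

$$E[Y] = E[E[Y \mid X]] := \begin{cases} \sum_{k=1}^{\infty} E[Y \mid X = x_k]\,p_X(x_k) & \text{if } X \text{ is discrete}, \\ \int_{-\infty}^{\infty} E[Y \mid X = x]\,f_X(x)\,dx & \text{if } X \text{ is continuous}. \end{cases}$$

(ii) In general,

$$E[g(Y)] = E[E[g(Y) \mid X]]$$

for any function $g(\cdot)$. It follows that

$$\mathrm{VAR}[Y] = E[E[Y^2 \mid X]] - \{E[E[Y \mid X]]\}^2.$$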
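The two identities in these remarks can be illustrated on a small discrete example. The sketch below reuses the joint probability table of the discrete example that precedes Section 4.2 (the pair with $p_{X,Y}(0,0) = 1/4$); the helper names `marginal_x` and `cond_moment_y` are hypothetical, and the code is a check under those assumptions rather than a general implementation.

```python
from fractions import Fraction as F

# Joint probability function p_{X,Y}(x, y); pairs with probability 0 are omitted.
joint = {(0, 0): F(1, 4), (1, 0): F(1, 6), (2, 0): F(1, 36),
         (0, 1): F(1, 3), (1, 1): F(1, 9), (0, 2): F(1, 9)}

def marginal_x(x):
    """p_X(x), obtained by summing the joint probabilities over y."""
    return sum(p for (i, _), p in joint.items() if i == x)

def cond_moment_y(x, power=1):
    """E[Y**power | X = x] = sum_j y_j**power * p_{X,Y}(x, y_j) / p_X(x)."""
    return sum(j**power * p for (i, j), p in joint.items() if i == x) / marginal_x(x)

xs = sorted({i for i, _ in joint})

# E[Y] = E[E[Y | X]] and E[Y^2] = E[E[Y^2 | X]], by conditioning on X.
mean_y = sum(cond_moment_y(x) * marginal_x(x) for x in xs)
second_moment_y = sum(cond_moment_y(x, 2) * marginal_x(x) for x in xs)
var_y = second_moment_y - mean_y**2

print(mean_y, var_y)
```

Here Y follows a B(2, 1/3) distribution, so the values obtained by conditioning should agree with the binomial mean np and variance np(1 - p).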
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Example 4.2.2. Let 

$$f_{X,Y}(x,y) = \begin{cases} k(x^2 + y^2) & \text{for } 0 < x < a,\ 0 < y < b, \\ 0 & \text{elsewhere}. \end{cases}$$

(a) Calculate the constant k.

(b) Find the marginal density functions of X and Y. Are X and Y independent?

(c) Obtain the conditional density functions $f_X(x \mid Y = y)$, $f_Y(y \mid X = x)$, and
$f_Y(y \mid X < a/2)$.

(d) Find the distribution function $F_{X,Y}(x,y)$.


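Solution. (a) We have:

$$1 = \int_0^b \int_0^a k(x^2 + y^2)\,dx\,dy = k \int_0^b \left[\frac{x^3}{3} + x y^2\right]_{x=0}^{x=a} dy = k \int_0^b \left(\frac{a^3}{3} + a y^2\right) dy = k \left[\frac{a^3 y}{3} + \frac{a y^3}{3}\right]_0^b = k \left(\frac{a^3 b}{3} + \frac{a b^3}{3}\right).$$

Thus, k must be given by

$$k = \frac{3}{a^3 b + a b^3} = \frac{3}{(ab)(a^2 + b^2)}.$$

(b) We can write that

$$f_X(x) = \int_0^b k(x^2 + y^2)\,dy = k \left[y x^2 + \frac{y^3}{3}\right]_0^b = k \left(b x^2 + \frac{b^3}{3}\right) \quad \text{for } 0 < x < a,$$

where k has been calculated in part (a). Similarly, we find that

$$f_Y(y) = k \left(a y^2 + \frac{a^3}{3}\right) \quad \text{for } 0 < y < b.$$

Now, we have:

$$f_X(0)\,f_Y(0) = k\,\frac{b^3}{3} \cdot k\,\frac{a^3}{3} = k^2\,\frac{(ab)^3}{9} \neq 0 = f_{X,Y}(0,0).$$

Therefore, X and Y are not independent random variables.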

Remark. When the joint density function, $f_{X,Y}(x,y)$, is a constant c multiplied by a
sum or a difference, like $x + y$, $x^2 - y^2$, and so on, the random variables X and Y cannot
be independent. Indeed, it is impossible to write, in particular, that

$$c(x + y) = f(x)\,g(y),$$

where $f(x)$ depends only on x and $g(y)$ depends only on y.

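(c) We calculate

$$f_X(x \mid Y = y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} \stackrel{\text{(b)}}{=} \frac{k(x^2 + y^2)}{k\left(\frac{a^3}{3} + a y^2\right)} = \frac{3(x^2 + y^2)}{a(a^2 + 3y^2)}$$

for 0 < x < a and 0 < y < b. Similarly, we find that

$$f_Y(y \mid X = x) = \frac{3(x^2 + y^2)}{b(3x^2 + b^2)}$$

for 0 < x < a and 0 < y < b. Finally, we have:

$$f_Y(y \mid X < a/2) = \frac{\int_0^{a/2} k(x^2 + y^2)\,dx}{\int_0^{a/2} k\left(b x^2 + \frac{b^3}{3}\right) dx} = \frac{\left[\frac{x^3}{3} + x y^2\right]_0^{a/2}}{\left[\frac{b x^3}{3} + \frac{b^3 x}{3}\right]_0^{a/2}} = \frac{\frac{a^3}{24} + \frac{a y^2}{2}}{\frac{a^3 b}{24} + \frac{a b^3}{6}} = \frac{a^2 + 12 y^2}{b a^2 + 4 b^3}$$

for 0 < y < b.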


(d) By definition,

$$F_{X,Y}(x,y) = \int_0^y \int_0^x k(u^2 + v^2)\,du\,dv = k \int_0^y \left(\frac{x^3}{3} + x v^2\right) dv = k \left(\frac{x^3 y}{3} + \frac{x y^3}{3}\right) = \frac{xy(x^2 + y^2)}{ab(a^2 + b^2)}$$

for $0 \le x \le a$ and $0 \le y \le b$. Hence, we deduce that (see Figure 4.1)

$$F_{X,Y}(x,y) = \begin{cases} 0 & \text{if } x < 0 \text{ or } y < 0, \\[4pt] \dfrac{xy(x^2 + y^2)}{ab(a^2 + b^2)} & \text{if } 0 \le x \le a \text{ and } 0 \le y \le b, \\[8pt] \dfrac{xb(x^2 + b^2)}{ab(a^2 + b^2)} & \text{if } 0 \le x \le a \text{ and } y > b, \\[8pt] \dfrac{ay(a^2 + y^2)}{ab(a^2 + b^2)} & \text{if } x > a \text{ and } 0 \le y \le b, \\[8pt] 1 & \text{if } x > a \text{ and } y > b. \end{cases}$$

Fig. 4.1. Joint distribution function in Example 4.2.2.


Remark. Corresponding to Formula (3.1) in the one-dimensional case, we have:

$$\frac{\partial^2}{\partial x\,\partial y} F_{X,Y}(x,y) = f_{X,Y}(x,y)$$

for any point $(x,y)$ at which the function $F_{X,Y}(x,y)$ is differentiable.
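The formulas of Example 4.2.2 can be checked numerically for particular values of a and b. The sketch below is only a sanity check under assumptions that are not in the text: a = 1 and b = 2 are arbitrary illustrative values, and the helper `midpoint_double_integral` is a hypothetical midpoint-rule integrator. It verifies that the density integrates to 1, that the closed form of $F_{X,Y}$ agrees with direct integration, and that the mixed partial derivative of $F_{X,Y}$ gives back $f_{X,Y}$.

```python
# Illustrative values only (assumed); any a > 0 and b > 0 could be used.
a, b = 1.0, 2.0
k = 3.0 / (a * b * (a**2 + b**2))

def f(x, y):
    """Joint density k(x^2 + y^2) on (0, a) x (0, b), and 0 elsewhere."""
    return k * (x**2 + y**2) if 0 < x < a and 0 < y < b else 0.0

def F(x, y):
    """Closed-form joint distribution function for 0 <= x <= a, 0 <= y <= b."""
    return x * y * (x**2 + y**2) / (a * b * (a**2 + b**2))

def midpoint_double_integral(func, x_hi, y_hi, n=400):
    """Midpoint-rule approximation of the integral of func over (0, x_hi) x (0, y_hi)."""
    hx, hy = x_hi / n, y_hi / n
    return sum(func((i + 0.5) * hx, (j + 0.5) * hy)
               for i in range(n) for j in range(n)) * hx * hy

total = midpoint_double_integral(f, a, b)          # should be close to 1
F_numeric = midpoint_double_integral(f, 0.5, 1.5)  # should be close to F(0.5, 1.5)

# Central-difference approximation of the mixed partial derivative of F.
h = 1e-3
x0, y0 = 0.5, 1.5
mixed = (F(x0 + h, y0 + h) - F(x0 + h, y0 - h)
         - F(x0 - h, y0 + h) + F(x0 - h, y0 - h)) / (4 * h * h)

print(total, F_numeric, F(0.5, 1.5), mixed, f(x0, y0))
```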


Example 4.2.3. Suppose that $X \sim \text{Exp}(\lambda_1)$ and $Y \sim \text{Exp}(\lambda_2)$ are independent random
variables. Making use of (4.5), we can write that

$$P[Y > X] = \int_0^\infty P[Y > X \mid X = x]\,f_X(x)\,dx \stackrel{\text{ind.}}{=} \int_0^\infty P[Y > x]\,\lambda_1 e^{-\lambda_1 x}\,dx = \int_0^\infty e^{-\lambda_2 x}\,\lambda_1 e^{-\lambda_1 x}\,dx = \lambda_1 \int_0^\infty e^{-(\lambda_1 + \lambda_2)x}\,dx = \frac{\lambda_1}{\lambda_1 + \lambda_2}. \tag{4.6}$$

Remark. Note that if $\lambda_1 = \lambda_2$, then $P[X < Y] = 1/2$, which actually follows directly by
symmetry (and by continuity of the exponential distribution).
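Formula (4.6) can be compared with a simulation. The following is a minimal Monte Carlo sketch, not part of the original example: the rates 2 and 3, the sample size, the seed, and the function name `estimate_prob_y_greater_x` are all assumptions chosen for illustration.

```python
import random

def estimate_prob_y_greater_x(lam1, lam2, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P[Y > X] for independent X ~ Exp(lam1), Y ~ Exp(lam2)."""
    rng = random.Random(seed)
    # random.expovariate takes the rate parameter (lambda), not the mean.
    hits = sum(rng.expovariate(lam2) > rng.expovariate(lam1) for _ in range(n_samples))
    return hits / n_samples

lam1, lam2 = 2.0, 3.0             # illustrative rates (assumed)
estimate = estimate_prob_y_greater_x(lam1, lam2)
exact = lam1 / (lam1 + lam2)      # Formula (4.6)

print(estimate, exact)
```

With 200,000 samples the standard error of the estimate is about 0.001, so the estimate should agree with (4.6) to roughly two decimal places.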


4.3 Functions of random vectors 

In Chapter 3, we saw that any real-valued function of a random variable is itself a random 
variable. Similarly, any real-valued function of a random vector is a random variable. 
More generally, n real-valued functions of a random variable or of a random vector 
constitute a new random vector of dimension n. The most interesting transformations 
are the sum, the difference, the product, and the ratio of random variables. 

In general, we must be able to calculate the probability function or the density function
of the new random variable or vector. In this textbook, we treat the case of a single
function g of a two-dimensional random vector (X, Y). We also give important results
obtained when the function g is the sum (or a linear combination) of n independent
random variables.

Sometimes, we only need the mean, for instance, of the new random variable. In
that case, it is not necessary to first calculate the probability function or the density
function of g(X, Y).


4.3.1 Discrete case 

In the particular case when the number of possible values of the pair (X, Y ) of random 
variables is finite , we only have to apply the transformation g to each possible value of 
this pair and to add the probabilities of the points (x, y) that are transformed into the 
same value of g(x,y). 

Example 4.3.1. Consider again the joint probability function in Example 4.1.1:

 y\x      -1       0       1
  0      1/16    1/16    1/16
  1      1/16    1/16    2/16
  2      2/16    1/16    6/16

Let Z = XY. The random variable Z can take on five different values: −2, −1, 0, 1,
and 2. The point (−1, 2) corresponds to z = −2, (−1, 1) corresponds to z = −1, (1, 1)
is transformed into z = 1, (1, 2) becomes z = 2, and all the other points are such that
z = 0. From the previous table, we obtain that

 z          -2      -1       0       1       2      Σ
 p_Z(z)    2/16    1/16    5/16    2/16    6/16     1

It follows that the mean of Z is given by

$$E[Z] = (-2)\frac{2}{16} + (-1)\frac{1}{16} + 0 + (1)\frac{2}{16} + (2)\frac{6}{16} = \frac{9}{16}.$$

As we mentioned above, if we are only interested in obtaining the expected value of
the new random variable Z, then it is not necessary to calculate the function $p_Z(z)$. It
suffices to use Formula (4.11) of Section 4.4:

$$E[g(X,Y)] = \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} g(x_k, y_j)\,p_{X,Y}(x_k, y_j).$$

Here, we obtain that E[XY] = 9/16 (see Example 4.4.1), which agrees with the result
obtained above for Z = XY.
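The procedure of this example (apply g to each point and add the probabilities of the points sent to the same value) can be sketched in a few lines. The function name `pmf_of_transform` is hypothetical; the table is the one given in the example.

```python
from collections import defaultdict
from fractions import Fraction as F

# Joint probability function p_{X,Y}(x, y) of Example 4.3.1, keyed by (x, y).
joint = {(-1, 0): F(1, 16), (0, 0): F(1, 16), (1, 0): F(1, 16),
         (-1, 1): F(1, 16), (0, 1): F(1, 16), (1, 1): F(2, 16),
         (-1, 2): F(2, 16), (0, 2): F(1, 16), (1, 2): F(6, 16)}

def pmf_of_transform(joint, g):
    """Probability function of Z = g(X, Y) for a finite joint probability function."""
    pmf = defaultdict(F)  # F() is Fraction(0)
    for (x, y), p in joint.items():
        pmf[g(x, y)] += p  # points mapped to the same z have their probabilities added
    return dict(pmf)

p_z = pmf_of_transform(joint, lambda x, y: x * y)

# Mean of Z computed from p_Z, and directly from the joint function as in (4.11).
mean_from_pmf = sum(z * p for z, p in p_z.items())
mean_direct = sum(x * y * p for (x, y), p in joint.items())

print(sorted(p_z.items()), mean_from_pmf, mean_direct)
```

Both ways of computing the mean give the same value, which is the point made at the end of the example.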

Example 4.3.2. Suppose that we toss two distinct and well-balanced tetrahedrons,
whose faces are numbered 1, 2, 3, and 4. Let $X_1$ (resp., $X_2$) be the number of the face
on which the first (resp., second) tetrahedron lands, and let Y be the maximum between
$X_1$ and $X_2$. What is the probability function of the random variable Y?

Solution. The possible values of Y are 1, 2, 3, and 4. Let $A_k$ (resp., $B_k$) be the event “the
random variable $X_1$ (resp., $X_2$) takes on the value k,” for k = 1, ..., 4. By independence
of the events $A_j$ and $B_k$ for all j and k, we have:

$$p_Y(1) = P[A_1 \cap B_1] = P[A_1]P[B_1] = \frac{1}{4} \times \frac{1}{4} = \frac{1}{16}.$$

Similarly, by independence and incompatibility, we may write that

$$p_Y(2) = P[(A_1 \cap B_2) \cup (A_2 \cap B_1) \cup (A_2 \cap B_2)] = P[A_1]P[B_2] + P[A_2]P[B_1] + P[A_2]P[B_2] = \frac{1}{4} \times \frac{1}{4} + \frac{1}{4} \times \frac{1}{4} + \frac{1}{4} \times \frac{1}{4} = \frac{3}{16}.$$

Next, using the equiprobability of the events (and of the intersections), we obtain that

$$p_Y(3) = P[(A_1 \cap B_3) \cup (A_2 \cap B_3) \cup (A_3 \cap B_3) \cup (A_3 \cap B_2) \cup (A_3 \cap B_1)] = 5 \times \frac{1}{4} \times \frac{1}{4} = \frac{5}{16}.$$

Finally, because we must have that $\sum_{y=1}^{4} p_Y(y) = 1$, we obtain the following table:

 y           1       2       3       4
 p_Y(y)    1/16    3/16    5/16    7/16

It follows that

 y           1       2       3       4
 F_Y(y)    1/16    1/4     9/16     1
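The two tables can be reproduced by listing the 16 equally likely outcomes of the two tosses. This is a short illustrative sketch; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 5)

# Each ordered pair (x1, x2) has probability 1/16; Y is the maximum of the pair.
p_y = {y: Fraction(0) for y in faces}
for x1, x2 in product(faces, repeat=2):
    p_y[max(x1, x2)] += Fraction(1, 16)

# Distribution function F_Y(y) as the running sum of p_Y.
cdf = {}
running = Fraction(0)
for y in faces:
    running += p_y[y]
    cdf[y] = running

print(p_y, cdf)
```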


When the number of possible values of the pair (X, Y) is countably infinite , it is 
generally much more difficult to obtain the probability function of the random variable 
Z := g(X,Y). Indeed, there can be an infinite number of points (x,y) that correspond 
to the same z = g(x,y ), and there can also be an infinite number of different values 
of Z. However, in the case when the number of possible values of Z is finite, we can 
sometimes calculate pz(z) relatively easily. 

Example 4.3.3. Suppose that the joint probability function of the random vector
(X, Y) is given by the formula

$$p_{X,Y}(x,y) = \frac{e^{-2}}{x!\,y!} \quad \text{for } x = 0, 1, \ldots;\ y = 0, 1, \ldots.$$

Note that X and Y are actually two independent random variables that both follow a
Poisson distribution with parameter λ = 1. Let

$$Z = g(X,Y) := \begin{cases} 1 & \text{if } X = Y, \\ 0 & \text{if } X \neq Y. \end{cases}$$

In this case, Z has a Bernoulli distribution with parameter p, where

$$p := P[X = Y] = \sum_{x=0}^{\infty} \frac{e^{-2}}{(x!)^2}.$$

We find, using a mathematical software package for instance, that the above infinite
series converges to $e^{-2} I_0(2) \simeq 0.3085$, where $I_0(\cdot)$ is a modified Bessel function. We could
actually obtain a very good approximation to the exact result by adding the first five
terms of the series, because

$$\sum_{x=0}^{4} \frac{e^{-2}}{(x!)^2} \simeq 0.3085$$

as well.
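The claim about the first five terms is easy to verify with partial sums. The sketch below uses only the standard library; the function name `partial_sum` and the choice of 30 terms as a stand-in for the full series are assumptions made for illustration.

```python
import math

def partial_sum(n_terms):
    """Sum of the first n_terms terms of the series e^{-2} / (x!)^2, x = 0, 1, ..."""
    return math.exp(-2) * sum(1 / math.factorial(x) ** 2 for x in range(n_terms))

five_terms = partial_sum(5)    # x = 0, ..., 4, as in the example
many_terms = partial_sum(30)   # numerically indistinguishable from the full series

print(five_terms, many_terms)
```

The terms decrease so quickly that the two values differ by about 1e-5.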


4.3.2 Continuous case 

Suppose that we wish to obtain the density function of the transformation $Z := g_1(X, Y)$
of the continuous random vector (X, Y). We consider only the case when it is possible
to define an auxiliary variable $W = g_2(X, Y)$ such that the system

$$z = g_1(x,y), \qquad w = g_2(x,y)$$

possesses a unique solution: $x = h_1(z,w)$ and $y = h_2(z,w)$. The following proposition
can then be proved.

Proposition 4.3.1. Let (X, Y) be a continuous random vector and let $Z = g_1(X,Y)$
and $W = g_2(X,Y)$. Suppose that the functions $x = h_1(z,w)$ and $y = h_2(z,w)$ have continuous
partial derivatives (with respect to z and w) for all $(z,w)$ and that the Jacobian
of the transformation:

$$J(z,w) := \begin{vmatrix} \partial h_1/\partial z & \partial h_1/\partial w \\ \partial h_2/\partial z & \partial h_2/\partial w \end{vmatrix}$$

is not identical to zero. Then, we can write that

$$f_{Z,W}(z,w) = f_{X,Y}(h_1(z,w), h_2(z,w))\,|J(z,w)|.$$

It follows that

$$f_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(h_1(z,w), h_2(z,w))\,|J(z,w)|\,dw.$$

Remarks. (i) We generally choose a very simple auxiliary variable W, for example,
W = X.

(ii) In the particular case when $g_1(x,y)$ is a linear transformation of x and y, it suffices
to choose another linear transformation of x and y for the partial derivatives of the functions
$h_1$ and $h_2$ to be continuous. Indeed, these partial derivatives are then constants.
Therefore, they are continuous at any point $(z,w)$.


Example 4.3.4. Let X ~ U(0, 1) and Y ~ U(0, 1) be two independent random variables
and let Z = X + Y. To obtain the density function of Z, we define the auxiliary variable
W = X. Then, the system

$$z = x + y, \qquad w = x$$

has the unique solution x = w, y = z − w. Moreover, the partial derivatives of the
functions $h_1(z,w) = w$ and $h_2(z,w) = z - w$ are continuous $\forall (z,w)$ and the Jacobian

$$J(z,w) = \begin{vmatrix} 0 & 1 \\ 1 & -1 \end{vmatrix} = -1$$

is different from zero for all $(z,w)$. Consequently, we can write that

$$f_{Z,W}(z,w) = f_{X,Y}(w, z - w)\,|-1| \stackrel{\text{ind.}}{=} f_X(w)\,f_Y(z - w)$$

$$\Longrightarrow \quad f_{Z,W}(z,w) = 1 \cdot 1 \quad \text{if } 0 < w < 1 \text{ and } 0 < z - w < 1.$$

Because 0 < z < 2, the set of possible values of w is the interval (0, z), if 0 < z < 1, and
the interval (z − 1, 1), if $1 \le z < 2$.

Finally, we have (see Figure 4.2)

$$f_Z(z) = \begin{cases} \int_0^z 1\,dw = z & \text{if } 0 < z < 1, \\[4pt] \int_{z-1}^{1} 1\,dw = 2 - z & \text{if } 1 \le z < 2. \end{cases}$$


4.3.3 Convolutions 

Let X be a discrete random variable whose possible values are $x_1, x_2, \ldots$. The convolution
of X with itself is obtained by applying the transformation of interest, for instance,
the sum, the difference, the product, and so on, to the points $(x_1, x_1)$, $(x_1, x_2)$, ...,
$(x_2, x_1)$, $(x_2, x_2)$, .... Therefore, if X can take on n different values, then the transformation
must be applied to $n \times n = n^2$ points. We write $X \otimes X$ to denote the convolution
product of X with itself, $X \oplus X$ for the convolution sum, and so on. Observe that obtaining
the distribution of $X \otimes X$, for example, is tantamount to finding the distribution
of the product $X_1 X_2$, where $X_1$ and $X_2$ are two independent random variables having
the same distribution as X.

Fig. 4.2. Density function in Example 4.3.4.


Example 4.3.5. Consider the probability function of the random variable X in Example
3.4.1:

 x          -1       0       1
 p_X(x)    1/4     1/4     1/2

Suppose that we want to obtain the distribution of the convolution product of X with
itself. We find that the possible results of this convolution are −1, 0, and 1. The points
(−1, 1) and (1, −1) correspond to −1, the points (−1, −1) and (1, 1) to 1, and the five
other points to 0. Then, if we define $Y = X \otimes X$, we deduce from the above table that

 y          -1       0       1
 p_Y(y)    1/4     7/16    5/16

because

$$P[Y = -1] = P[X = -1]P[X = 1] + P[X = 1]P[X = -1] = 2 \times (1/4)(1/2) = 1/4,$$

and so on.

Remarks. (i) Note that the result obtained is completely different from the probability
function of $Z := X^2$ calculated in Example 3.4.1:

 z           0       1
 p_Z(z)    1/4     3/4

(ii) If we calculate the convolution difference of X with itself, we find that the probability
function of $D := X \ominus X$ is given by

 d          -2      -1       0       1       2
 p_D(d)    1/8     3/16    3/8     3/16    1/8
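The convolution product and the convolution difference of this example can be computed by going through the $3 \times 3 = 9$ points, as described at the beginning of the subsection. The helper `self_convolution` below is a hypothetical name for this illustrative sketch.

```python
import operator
from fractions import Fraction as F
from itertools import product

# Probability function of X in Example 4.3.5.
p_x = {-1: F(1, 4), 0: F(1, 4), 1: F(1, 2)}

def self_convolution(pmf, op):
    """Distribution of op(X1, X2), where X1 and X2 are independent copies of X."""
    out = {}
    for (x1, p1), (x2, p2) in product(pmf.items(), repeat=2):
        value = op(x1, x2)
        out[value] = out.get(value, F(0)) + p1 * p2  # independence: multiply probabilities
    return out

p_product = self_convolution(p_x, operator.mul)     # convolution product of X with itself
p_difference = self_convolution(p_x, operator.sub)  # convolution difference of X with itself

print(sorted(p_product.items()), sorted(p_difference.items()))
```

Passing `operator.add` instead would give the convolution sum.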
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In general, it is difficult to calculate the distribution of the convolution of a discrete 
random variable X with itself k times, where k is arbitrary, especially if the number 
of values that X can take is (countably) infinite. However, in a few particular cases we 
can obtain a general formula, valid for any integer k. In fact, we can prove some results 
for the sum of independent random variables following the same distribution, but not 
necessarily having the same parameters. The most important results of this type are 
the following, where Xi, ..., X n are n independent random variables. 

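(1) If $X_i$ has a Bernoulli distribution with parameter p, for all i, then we have:

$$\sum_{i=1}^{n} X_i \sim B(n, p) \quad \text{for } n = 1, 2, \ldots.$$

More generally, if $X_i \sim B(m_i, p)$, for i = 1, ..., n, then we find that

$$\sum_{i=1}^{n} X_i \sim B\left(\sum_{i=1}^{n} m_i,\ p\right).$$

(2) If $X_i \sim \text{Poi}(\lambda_i)$, for i = 1, ..., n, then we may write that

$$\sum_{i=1}^{n} X_i \sim \text{Poi}\left(\sum_{i=1}^{n} \lambda_i\right).$$

(3) If $X_i \sim \text{Geo}(p)$, for i = 1, ..., n, then we have:

$$\sum_{i=1}^{n} X_i \sim \text{NB}(n, p).$$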

Suppose now that X and Y are two independent continuous random variables. Let
Z = X + Y. We can show that the density function of Z is obtained by computing the
convolution product of the density function of X with that of Y. That is, we have:

$$f_Z(z) = \int_{-\infty}^{\infty} f_X(u)\,f_Y(z - u)\,du. \tag{4.8}$$

We could use this formula to obtain the density function of Z in Example 4.3.4.
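As an illustration of (4.8), the convolution integral can be approximated numerically for the two U(0, 1) variables of Example 4.3.4. The sketch below is an assumption-laden check rather than a general routine: the names `uniform_density` and `convolution_density` are hypothetical, and the integration range is restricted to (0, 1), the support of $f_X$, where a midpoint rule is applied.

```python
def uniform_density(t):
    """Density of a U(0, 1) random variable."""
    return 1.0 if 0.0 < t < 1.0 else 0.0

def convolution_density(f_x, f_y, z, lo=0.0, hi=1.0, n=10_000):
    """Midpoint-rule approximation of the integral of f_x(u) f_y(z - u) for lo < u < hi."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        total += f_x(u) * f_y(z - u)
    return total * h

# f_Z(z) should be z on (0, 1) and 2 - z on [1, 2).
values = {z: convolution_density(uniform_density, uniform_density, z)
          for z in (0.25, 1.0, 1.5)}

print(values)
```

The three values should be close to 0.25, 1, and 0.5, in agreement with the triangular density obtained in Example 4.3.4.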

As in the discrete case, we can prove some results for the sum (and sometimes for
linear combinations) of independent random variables $X_i$. We find, in particular, that

(1) if $X_i \sim \text{Exp}(\lambda)$, for i = 1, ..., n, then

$$\sum_{i=1}^{n} X_i \sim G(n, \lambda); \tag{4.9}$$

(2) if $X_i \sim G(\alpha_i, \lambda)$, for i = 1, ..., n, then

$$\sum_{i=1}^{n} X_i \sim G\left(\sum_{i=1}^{n} \alpha_i,\ \lambda\right);$$

(3) if $X_i \sim N(\mu_i, \sigma_i^2)$, for i = 1, ..., n, then

$$\sum_{i=1}^{n} a_i X_i \sim N\left(\sum_{i=1}^{n} a_i \mu_i,\ \sum_{i=1}^{n} a_i^2 \sigma_i^2\right), \tag{4.10}$$

where $a_i$ is a real constant, for all i.

Remarks. (i) The best way to prove these results is to make use of the characteristic
function, or of the moment-generating function, which are actually particular mathematical
expectations. The characteristic function of the random variable X is defined
by $E[e^{j\omega X}]$, where $j := \sqrt{-1}$.

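(ii) A result of the same type, but for the product of independent random variables
$X_1, \ldots, X_n$, is the following: if $X_i \sim \text{LN}(\mu_i, \sigma_i^2)$, then

$$\prod_{i=1}^{n} X_i^{a_i} \sim \text{LN}\left(\sum_{i=1}^{n} a_i \mu_i,\ \sum_{i=1}^{n} a_i^2 \sigma_i^2\right).$$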



4.4 Covariance and correlation coefficient 


Definition 4.4.1. Let (X, Y) be a pair of random variables. We define the mathematical
expectation of the function g(X, Y) by

$$E[g(X,Y)] = \begin{cases} \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} g(x_k, y_j)\,p_{X,Y}(x_k, y_j) & \text{(discrete case)}, \\[4pt] \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x,y)\,f_{X,Y}(x,y)\,dx\,dy & \text{(continuous case)}. \end{cases} \tag{4.11}$$

Particular cases. (i) If g(X, Y) = X, then we have that $E[g(X,Y)] = E[X] = \mu_X$.

(ii) If g(X, Y) = XY, we then obtain the formula that enables us to calculate the
mathematical expectation of a product, which is used in the calculation of the covariance
and of the correlation coefficient.

(iii) If the function g is a linear combination of the random variables X and Y, that is,
if we have:

$$g(X,Y) = aX + bY + c,$$

where a, b, and c are real constants, then we easily prove that

$$E[g(X,Y)] = aE[X] + bE[Y] + c$$

(if the expected values exist). This formula may be generalized to the case of linear
combinations of n random variables $X_1, \ldots, X_n$. Moreover, if Y = h(X), for example,
$Y = X^2$, then the above formula enables us to write that

$$E[aX + bX^2] = aE[X] + bE[X^2],$$

as we did in Chapter 3.

Definition 4.4.2. The covariance of X and Y is defined by

$$\mathrm{COV}[X,Y] \equiv \sigma_{X,Y} = E[(X - \mu_X)(Y - \mu_Y)].$$

Remarks. (i) We can show that

$$\mathrm{COV}[X,Y] = E[XY] - E[X]E[Y].$$

(ii) If X and Y are two independent random variables and if we can write that $g(X,Y) =
g_1(X)g_2(Y)$, then we have that $E[g(X,Y)] = E[g_1(X)]E[g_2(Y)]$. It follows that if X and
Y are independent, then

$$\mathrm{COV}[X,Y] \stackrel{\text{ind.}}{=} E[X]E[Y] - E[X]E[Y] = 0.$$

If the covariance of the random variables X and Y is equal to zero, they are not
necessarily independent. Nevertheless, we can show that, if X and Y are two random
variables having a normal distribution, then X and Y are independent if and only if
COV[X, Y] = 0.

(iii) We have that $\mathrm{COV}[X,X] = E[X^2] - (E[X])^2 = \mathrm{VAR}[X]$. Thus, the variance is a
particular case of the covariance. However, contrary to the variance, the covariance may
be negative.

(iv) If g(X, Y) is a linear combination of X and Y, then we find that

$$\mathrm{VAR}[g(X,Y)] = \mathrm{VAR}[aX + bY + c] = a^2\,\mathrm{VAR}[X] + b^2\,\mathrm{VAR}[Y] + 2ab\,\mathrm{COV}[X,Y]. \tag{4.12}$$

Note that the constant c does not influence the variance of g(X, Y). Furthermore, if X
and Y are independent random variables, then we have:

$$\mathrm{VAR}[aX + bY + c] \stackrel{\text{ind.}}{=} a^2\,\mathrm{VAR}[X] + b^2\,\mathrm{VAR}[Y].$$

Finally, Formula (4.12) can be generalized to the case of a linear combination of n
(independent or dependent) random variables.

Definition 4.4.3. The correlation coefficient of X and Y is given by

$$\mathrm{CORR}[X,Y] \equiv \rho_{X,Y} = \frac{\mathrm{COV}[X,Y]}{\sqrt{\mathrm{VAR}[X]\,\mathrm{VAR}[Y]}}.$$

We can show that $-1 \le \rho_{X,Y} \le 1$. Moreover, $\rho_{X,Y} = \pm 1$ if and only if we can write
that Y = aX + b, where $a \neq 0$. More precisely, $\rho_{X,Y} = 1$ (resp., −1) if a > 0 (resp.,
a < 0). In fact, $\rho_{X,Y}$ is a measure of the linear relationship between X and Y. Finally,
if X and Y are independent random variables, then we have that $\rho_{X,Y} = 0$.

In the case when X and Y are random variables having a normal distribution,
we have that $\rho_{X,Y} = 0 \iff$ X and Y are independent. This result is very important
in practice, because if we showed (by means of a statistical test) that the two random
variables follow (approximately) a normal distribution and if we found that their sample
correlation coefficient is close to zero, then, in the context of statistical procedures, we
can accept that they are independent.

Example 4.4.1. Consider the function $p_{X,Y}$ given by the following table (see Example
4.1.1):

 y\x        -1       0       1      p_Y(y)
  0        1/16    1/16    1/16     3/16
  1        1/16    1/16    2/16     4/16
  2        2/16    1/16    6/16     9/16
 p_X(x)    4/16    3/16    9/16      1

With the help of this table and the marginal probability functions $p_X$ and $p_Y$, we
calculate

$$E[X] = -1 \times (4/16) + 0 \times (3/16) + 1 \times (9/16) = 5/16,$$

$$E[Y] = 0 \times (3/16) + 1 \times (4/16) + 2 \times (9/16) = 22/16,$$

$$E[X^2] = (-1)^2 \times (4/16) + 0^2 \times (3/16) + 1^2 \times (9/16) = 13/16,$$

$$E[Y^2] = 0^2 \times (3/16) + 1^2 \times (4/16) + 2^2 \times (9/16) = 40/16,$$

$$E[XY] = \sum_{x=-1}^{1} \sum_{y=0}^{2} x\,y\,p_{X,Y}(x,y) = 0 + (-1)(1)(1/16) + (-1)(2)(2/16) + 0 + 0 + 0 + 0 + (1)(1)(2/16) + (1)(2)(6/16) = -1/16 - 4/16 + 2/16 + 12/16 = 9/16.$$

It follows that

$$\mathrm{VAR}[X] = E[X^2] - (E[X])^2 = \frac{13}{16} - \left(\frac{5}{16}\right)^2 = \frac{183}{(16)^2},$$

$$\mathrm{VAR}[Y] = E[Y^2] - (E[Y])^2 = \frac{40}{16} - \left(\frac{22}{16}\right)^2 = \frac{156}{(16)^2},$$

and

$$\mathrm{COV}[X,Y] = E[XY] - E[X]E[Y] = \frac{9}{16} - \frac{5}{16} \cdot \frac{22}{16} = \frac{34}{(16)^2}.$$

Finally, we calculate

$$\rho_{X,Y} = \mathrm{CORR}[X,Y] = \frac{34/(16)^2}{\sqrt{\frac{183}{(16)^2} \cdot \frac{156}{(16)^2}}} = \frac{34}{\sqrt{183 \cdot 156}} \simeq 0.2012.$$

Remark. In general, it is advisable to calculate the means E[X], E[Y], and E[XY] before
the expected values of the squared variables. Indeed, if E[XY] − E[X]E[Y] = 0, then
CORR[X, Y] = 0 (if X and Y are not constants, so that their variances are strictly
positive).
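The calculations of Example 4.4.1 follow directly from Formula (4.11) and can be reproduced in a few lines. In the sketch below, the helper `expect` is a hypothetical name for the discrete case of (4.11); exact fractions are used until the final square root.

```python
import math
from fractions import Fraction as F

# Joint probability function p_{X,Y}(x, y) of Example 4.4.1, keyed by (x, y).
joint = {(-1, 0): F(1, 16), (0, 0): F(1, 16), (1, 0): F(1, 16),
         (-1, 1): F(1, 16), (0, 1): F(1, 16), (1, 1): F(2, 16),
         (-1, 2): F(2, 16), (0, 2): F(1, 16), (1, 2): F(6, 16)}

def expect(g):
    """E[g(X, Y)] by Formula (4.11), discrete case."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - mean_x * mean_y
var_x = expect(lambda x, y: x * x) - mean_x**2
var_y = expect(lambda x, y: y * y) - mean_y**2
rho = float(cov) / math.sqrt(var_x * var_y)

print(mean_x, mean_y, cov, var_x, var_y, round(rho, 4))
```

Following the remark above, `cov` is computed before the variances: if it were zero, the correlation coefficient would be zero as well and the second moments would not be needed.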


Example 4.4.2. The joint density function of the continuous random vector (X, Y) is

$$f_{X,Y}(x,y) = \begin{cases} e^{-2y} & \text{if } 0 < x < 2,\ y > 0, \\ 0 & \text{elsewhere}. \end{cases}$$

What is the correlation coefficient of X and Y?

Solution. We can show that, when the joint density function of (X, Y) can be decomposed
into a product of a function of x only and a function of y only, the random
variables X and Y are independent, provided that there is no relationship between x
and y in the set $D_{X,Y}$ of possible values of the pair (X, Y). That is, this set $D_{X,Y}$ is of
the form

$$D_{X,Y} = \{(x,y) \in \mathbb{R}^2 : c_1 < x < c_2,\ k_1 < y < k_2\},$$

where the $c_i$s and the $k_i$s are constants, for i = 1, 2.

Here, the possible values of X and Y are not related and we can write that

$$e^{-2y} = g(x)\,h(y),$$

where $g(x) \equiv 1$ and $h(y) = e^{-2y}$. Therefore, we can conclude that X and Y are independent.
It follows that CORR[X, Y] = 0.

We can check that X and Y are indeed independent. We have:

$$f_X(x) = \int_0^\infty e^{-2y}\,dy = \left[-\frac{1}{2} e^{-2y}\right]_0^\infty = \frac{1}{2} \quad \text{if } 0 < x < 2$$

and

$$f_Y(y) = \int_0^2 e^{-2y}\,dx = 2e^{-2y} \quad \text{if } y > 0.$$

Thus, we have that $f_{X,Y}(x,y) = f_X(x)\,f_Y(y)$ for every point $(x,y)$, as required. Note
that X ~ U(0, 2) and Y ~ Exp(2) in this example.


4.5 Limit theorems 


In this section, we present two limit theorems. The first one is useful in statistics, in 
particular, and the second one is in fact the most important theorem in probability 
theory. 


Theorem 4.5.1. (Law of large numbers) Suppose that $X_1, X_2, \ldots$ are independent
random variables having the same distribution function as the variable X, whose mean
$\mu_X$ exists. Then, for any constant $\epsilon > 0$, we have:

$$\lim_{n \to \infty} P\left[\left|\frac{X_1 + \cdots + X_n}{n} - \mu_X\right| > \epsilon\right] = 0.$$

Remarks. (i) This theorem is known, more precisely, as the weak law of large numbers.
There is also the strong law of large numbers, for which the expected value of |X| must
exist.

(ii) In practice, the mean $\mu_X$ of the random variable X is unknown. To estimate it,
we gather many (independent) observations $X_i$ of X. The above result enables us to
assert that the arithmetic mean of these observations converges (in probability) to the
unknown mean of X.

(iii) We write that the random variables $X_1, X_2, \ldots$ are i.i.d. (independent and identically
distributed).

Theorem 4.5.2. (Central limit theorem) Suppose that $X_1, \ldots, X_n$ are independent
random variables having the same distribution function as the variable X, whose mean
$\mu_X$ and variance $\sigma_X^2$ exist ($\sigma_X > 0$). Then, the distribution of $S_n := \sum_{i=1}^{n} X_i$ tends to
that of a normal distribution, with mean $n\mu_X$ and variance $n\sigma_X^2$, as n tends to infinity.

Remarks. (i) Let us define

$$\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}.$$

Then, we can assert that the distribution of $\bar{X}$ tends to that of a $N(\mu_X, \sigma_X^2/n)$ distribution.









(ii) In general, if we add up 30 or more independent random variables, then the
normal distribution should be a good approximation to the exact (often unknown)
distribution of this sum. However, the number of variables that must be added, to obtain
a good approximation, actually depends on the degree of asymmetry of the distribution
of X.

(iii) We can, under certain conditions, generalize the central limit theorem (CLT) to the
case when the random variables $X_1, \ldots, X_n$ are not necessarily identically distributed.
Indeed, if the mean $\mu_{X_i}$ and the variance $\sigma_{X_i}^2$ of $X_i$ exist for all i, then, when n is large
enough, we have:

$$\sum_{i=1}^{n} X_i \approx N\left(\sum_{i=1}^{n} \mu_{X_i},\ \sum_{i=1}^{n} \sigma_{X_i}^2\right)$$

and

$$\bar{X} \approx N\left(\frac{1}{n} \sum_{i=1}^{n} \mu_{X_i},\ \frac{1}{n^2} \sum_{i=1}^{n} \sigma_{X_i}^2\right).$$

Example 4.5.1. An American town comprises 10,000 houses and two factories. The
demand for drinking water (in gallons) from a given house over an arbitrary day is
a random variable D such that E[D] = 50 and VAR[D] = 400. In the case of the
factories, the demand for drinking water follows (approximately) a $N(10{,}000, (2000)^2)$
distribution for factory 1 and a $N(25{,}000, (5000)^2)$ distribution for factory 2. Let $D_i$,
for i = 1, ..., 10,000, be the demand for drinking water from the ith house and $F_i$, for
i = 1, 2, be the demand from factory i. We assume that the random variables $D_i$ and
$F_i$ are independent and we set

$$X_d = \sum_{i=1}^{10{,}000} D_i \quad \text{(the domestic demand)}$$

and

$$X_t = X_d + F_1 + F_2 \quad \text{(the total demand)}.$$

(a) Find the number a such that $P[X_d > a] \simeq 0.01$.

(b) What should the production capacity of the drinking water treatment plant be if
we want to be able to satisfy the total demand with probability 0.98?

Solution. (a) By the central limit theorem, we may write that

$$X_d \approx N(10{,}000(50),\ 10{,}000(20^2)).$$

Remark. We assume that the random variables $D_i$ are independent among themselves.

Then, we have:

$$P[X_d > a] = 1 - P[X_d \le a] \simeq 1 - P\left[Z \le \frac{a - 500{,}000}{\sqrt{10{,}000}\,(20)}\right],$$

where Z ~ N(0, 1). It follows that

$$P[X_d > a] \simeq 0.01 \iff P\left[Z \le \frac{a - 500{,}000}{2000}\right] \simeq 0.99.$$

Now, we find in Table B.3, page 279, that $P[Z \le 2.33] \simeq 0.99$. Thus, we have:

$$a \simeq 500{,}000 + 2000(2.33) = 504{,}660.$$


(b) By independence, we may write that $X_t \approx N(\mu, \sigma^2)$, where [see (4.10)]

$$\mu = 500{,}000 + 10{,}000 + 25{,}000 = 535{,}000$$

and

$$\sigma^2 = (2000)^2 + (2000)^2 + (5000)^2 = 33{,}000{,}000.$$

Let c be the capacity of the drinking water treatment plant. We seek the value of c
such that $P[X_t \le c] = 0.98$. Because $P[Z \le 2.055] \simeq 0.98$ (see Table B.3), proceeding as in part
(a) we find that

$$c \simeq 535{,}000 + \sqrt{33{,}000{,}000}\,(2.055) \simeq 546{,}805.$$

Remark. We see in this example that it is not necessary to know the exact form of the
function $p_X$ or $f_X$ to be able to apply the central limit theorem. It is sufficient to know
the mean and the variance of X.
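The quantiles read from Table B.3 can also be obtained by software. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8) in place of the table, which is an assumption of this illustration and not the method of the example; since the table values 2.33 and 2.055 are rounded, the results differ slightly from those found above.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution N(0, 1)

# (a) Domestic demand: approximately N(10,000 * 50, 10,000 * 400) by the CLT.
mean_d = 10_000 * 50
sd_d = math.sqrt(10_000 * 400)
a = mean_d + sd_d * std_normal.inv_cdf(0.99)

# (b) Total demand: the means and the variances of X_d, F_1 and F_2 are added.
mean_t = mean_d + 10_000 + 25_000
sd_t = math.sqrt(10_000 * 400 + 2000**2 + 5000**2)
c = mean_t + sd_t * std_normal.inv_cdf(0.98)

print(round(a), round(c))
```

The values obtained are within about 10 gallons of 504,660 and 546,805, the difference being due to the rounding of the table entries.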
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Solved exercises 

Question no. 1 

Let

$$p_{X,Y}(x,y) = \frac{1}{6} \quad \text{if } x = 0 \text{ or } 1, \text{ and } y = 0, 1 \text{ or } 2.$$

Calculate $p_X(x)$.

Question no. 2

Calculate $f_X(x \mid Y = y)$ if

$$f_{X,Y}(x,y) = x + y \quad \text{for } 0 < x < 1,\ 0 < y < 1.$$


Question no. 3 

Suppose that X and Y are two random variables such that E[X] = E[Y] = 0,
$E[X^2] = E[Y^2] = 1$, and $\rho_{X,Y} = 1$. Calculate COV[X, Y].

Question no. 4 

Calculate P[X + Y > 1] if

$$f_{X,Y}(x,y) = 1 \quad \text{for } 0 < x < 1,\ 0 < y < 1.$$


Question no. 5 

Suppose that

$$p_{X,Y}(x,y) = \frac{8}{9} \left(\frac{1}{2}\right)^{x+y} \quad \text{if } x = 0 \text{ or } 1, \text{ and } y = 1 \text{ or } 2.$$

Calculate E[XY].

Question no. 6 

Suppose that X and Y are two random variables such that VAR[X] = VAR[Y] = 1
and COV[X, Y] = 1. Calculate VAR[X − 2Y].

Question no. 7 

Let X ~ N(0,1), Y ~ N(1,2) and Z ~ N(3,4) be independent random variables.
What distribution does W := X — Y 2Z follow? Also give the parameter(s) of this 
distribution. 


Question no. 8 

Suppose that X ~ Poi(A = 100). What other probability distribution can be used 
to calculate (approximately) p := P[X < 100]? Also give the parameter(s) of this 
distribution, as well as the approximate value of p. 

Question no. 9 

Suppose that X follows a B(n = 100, p = 0.4) distribution. Use a N(40, 24) distribution
to calculate approximately P[X = 40].

Question no. 10

We define $Y = \sum_{i=1}^{50} X_i$, where $E[X_i] = 0$, for i = 1, ..., 50, and the $X_i$s are
independent continuous random variables. Calculate approximately P[Y > 0].

Question no. 11 

The joint probability function, $p_{X,Y}$, of the pair (X, Y) is given by the following
table:

 y\x      -1       0       1
  0       1/9     1/9     1/9
  2       2/9     2/9     2/9

(a) Are the random variables X and Y independent? Justify.

(b) Evaluate $F_{X,Y}(0, 1/2)$.

(c) Let $Z = X^4$. Calculate $p_Z(z)$.

(d) Calculate $E[X^2 Y^2]$.

Question no. 12

Let

$$f_{X,Y}(x,y) = \begin{cases} 2 & \text{if } 0 < x < y,\ 0 < y < 1, \\ 0 & \text{elsewhere}. \end{cases}$$

Calculate $P[X > Y^2]$.

Question no. 13 

City buses pass by a certain street corner, between 7:00 a.m. and 7:30 p.m., according 
to a Poisson process at the (average) rate of four per hour. Let Y = Y^k=\Xk, where 
Xk is the total number of buses that pass during the kth 15-minute time period, from 
7:00 a.m. 

(a) What is the exact distribution of Y and its parameter(s)? 

(b) What other probability distribution can approximate the distribution of Y? Justify 
and give the parameter(s) of this distribution as well. 


Question no. 14 

We consider the discrete random variable X whose probability function is given by 


 x           0       1       2
 p_X(x)    1/2     1/4     1/4

Suppose that $X_1$ and $X_2$ are two independent random variables having the same distribution
as X. Calculate $P[X_1 = X_2]$.

Question no. 15 

The table below gives the function $p_{X,Y}(x,y)$ of the pair (X, Y) of discrete random
variables:

 y\x       0       1       3       4
  1       0.1     0.1      0      0.2
  2       0.3      0      0.2     0.1

Calculate $P[\{X < 5\} \cap \{Y < 2\}]$.

Question no. 16 

Calculate the covariance of X_1 and X_2 if 

$$f_{X_1,X_2}(x_1,x_2) = \begin{cases} 2 - x_1 - x_2 & \text{for } 0 < x_1 < 1,\ 0 < x_2 < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

Question no. 17 

Suppose that Y = 1/X, where X is a discrete random variable such that 


  x      |  1     2
---------+------------
 p_X(x)  | 1/3   2/3


We define W = Y_1 - Y_2, where Y_1 and Y_2 are two independent random variables identically
distributed as Y. Calculate p_W(w). 

Question no. 18 

Let X_1, ..., X_n be independent random variables, where X_i has an exponential
distribution with parameter λ = 2, for i = 1, ..., n. Use the central limit theorem to find 
the value of n for which 

$$P\left[\sum_{i=1}^{n} X_i > \frac{n}{2} + 1\right] \simeq 0.4602.$$


Question no. 19 

A bus passes by a certain street corner every morning around 9:00 a.m. Let X be the 
difference (in minutes) between the time instant at which the bus passes and 9:00 a.m. 
We suppose that X has approximately a N(μ = 0, σ² = 25) distribution. We consider 
two independent days. Let X_k be the value of the random variable X on the kth day, 
for k = 1, 2. 

(a) Calculate the probability P[X_1 - X_2 > 15]. 

(b) Find the joint density function f_{X_1,X_2}(x_1, x_2). 

(c) Calculate (i) P[X_1 = 2 | X_1 > 1] and (ii) P[X_1 < 2 | X_1 = 1]. 

Question no. 20 

An assembly comprises 100 sections. The length of each section (in centimeters) 
is a random variable with mean 10 and variance 0.9. Furthermore, the sections are 
independent. The technical specification for the total length of the assembly is 1000 cm 
± 30 cm. What is approximately the probability that the assembly fails to meet the 
specification in question? 


Question no. 21 

Let 

$$f_{X,Y}(x,y) = \begin{cases} 3x^2 e^{-x} y(1 - y) & \text{if } x > 0,\ 0 < y < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

(a) Calculate the functions f_X(x) and f_Y(y). What is the distribution of X and that of 
Y? 

(b) Are X and Y independent random variables? Justify. 

(c) Calculate the kurtosis of X. 

(d) Calculate the skewness of Y. 
Question no. 22 

The following table gives the joint probability function p_{X,Y}(x, y) of the pair (X, Y): 

 y\x |  0     1     2
-----+------------------
 -1  | 1/9   0     1/9
  0  | 2/9   0     2/9
  1  | 0     1/3   0


(a) Find p_X(x) and p_Y(y). 

(b) Are X and Y independent random variables? Justify. 

(c) Calculate (i) p_Y(y | X = 1) and (ii) p_Y(y | X < 1). 

(d) Calculate the correlation coefficient of X and Y. 

(e) Let W = max{X, Y}. Find p_W(w). 

Question no. 23 

We consider the pair (X, Y) of discrete random variables whose joint probability 
function p_{X,Y}(x, y) is given by 

 y\x |  1      2      3
-----+---------------------
  2  | 1/12   1/6    1/12
  3  | 1/6    0      1/6
  4  | 0      1/3    0


Calculate P[X + Y < 4 | X < 2]. 

Question no. 24 

Use a normal distribution to calculate approximately the probability that, among 
10,000 (independent) random digits, the digit “7” appears more than 968 times. 
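
A normal approximation of this kind can be checked numerically. The following is a minimal sketch, assuming that each digit equals 7 with probability 1/10 (so that the count is binomial with n = 10,000 and p = 0.1) and that a continuity correction is applied; the function names are hypothetical and are not taken from the book.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_upper_tail_normal(n: int, p: float, k: int) -> float:
    """Approximate P[X > k] for X ~ B(n, p) by a normal distribution.

    A continuity correction is used: P[X > k] = P[X >= k + 1] is replaced
    by the probability that the normal variable exceeds k + 0.5.
    """
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    return 1.0 - normal_cdf((k + 0.5 - mean) / sd)


# Assumed model for this question: n = 10,000 digits, p = 1/10 for the digit 7.
approx = binomial_upper_tail_normal(10_000, 0.1, 968)
print(round(approx, 4))
```

The same helper applies to any binomial count; only n, p, and the threshold k change.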

Question no. 25 

A number X is taken at random in the interval (0,1), and next a number Y is taken 
at random in the interval (0,X], so that 


$$f_{X,Y}(x,y) = \begin{cases} 1/x & \text{if } 0 < x < 1,\ 0 < y < x, \\ 0 & \text{elsewhere.} \end{cases}$$

(a) Show that 

$$E[X^r Y^s] = \frac{1}{(s+1)(r+s+1)}$$

for r, s = 0, 1, 2, ... . 

(b) Check the formula in part (a) for r = 2 and s = 0 by directly calculating E[X^2]. 

(c) Use part (a) to calculate the correlation coefficient of X and Y. 
Question no. 26 

The following table gives part of the function p_{X,Y}(x, y) of the pair (X, Y) of discrete 
random variables: 


y\x 

0 1 2 

Py(v) 

-1 

1/16 1/16 

1/4 

0 


1/2 

1 

0 

1/4 

px{x) 

1/4 

1 


We also have: 


  y              |  -1    0     1
-----------------+------------------
 p_Y(y | X = 2)  | 1/8   3/8   1/2


(a) Find P[X = 2]. 

(b) Complete the table of the function p_{X,Y}(x, y). 

(c) We set W = Y + 1. The distribution of W is then a particular case of one of the 
discrete distributions seen in Chapter 3. Find this distribution and give its parameter(s). 


Question no. 27 

The joint density function of the pair (X, Y) of continuous random variables is given 
by 

$$f_{X,Y}(x,y) = \begin{cases} \frac{1}{2} xy & \text{if } 0 < y < x < 2, \\ 0 & \text{elsewhere.} \end{cases}$$

(a) Calculate E[1/(XY)]. 

(b) Calculate E[X^2]. 

(c) What is the median, x_m, of the random variable X? 

Question no. 28 

A device is constituted of two independent components connected in parallel. The 
lifetime X (in years) of component no. 1 follows an exponential distribution with
parameter λ = 1/2, whereas the lifetime Y (in years) of component no. 2 has a Weibull 
distribution with parameters λ = 2 and β = 2. That is, 

$$f_Y(y) = 4y e^{-2y^2} \quad \text{for } y > 0.$$


Calculate the probability that the device lasts less than one year. 

Question no. 29 

We take 100 numbers at random in the interval [0,1]. Let S be the sum of these 
100 numbers. Use the central limit theorem to calculate approximately the probability 
P[45 < S < 55]. 
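
As a sanity check on central limit theorem calculations of this type, the approximation can be compared with a simulation. The sketch below assumes that "at random" means independent U[0, 1] numbers, so that S has mean 100/2 and variance 100/12; the helper names, the number of trials, and the seed are illustrative choices only.

```python
import math
import random


def normal_cdf(z: float) -> float:
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def clt_sum_uniform_prob(n: int, lo: float, hi: float) -> float:
    """CLT approximation of P[lo < S < hi], S being the sum of n U[0, 1] numbers."""
    mean = n * 0.5                # each U[0, 1] number has mean 1/2
    sd = math.sqrt(n / 12.0)      # and variance 1/12
    return normal_cdf((hi - mean) / sd) - normal_cdf((lo - mean) / sd)


def simulate_sum_uniform_prob(n: int, lo: float, hi: float,
                              trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability (seeded for repeatability)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(rng.random() for _ in range(n))
        if lo < s < hi:
            hits += 1
    return hits / trials


clt_value = clt_sum_uniform_prob(100, 45.0, 55.0)
mc_value = simulate_sum_uniform_prob(100, 45.0, 55.0, trials=20_000)
print(clt_value, mc_value)
```

The two numbers should agree to about two decimal places; a larger gap would suggest an error in the mean or variance used in the approximation.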
Question no. 30 

The number of floods that occur in a certain region over a given year is a random 
variable having a Poisson distribution with parameter α = 2, independently from 
one year to the other. Moreover, the time period (in days) during which the ground 
is flooded, at the time of an arbitrary flood, is an exponential random variable with 
parameter λ = 1/5. We assume that the durations of the floods are independent. Use 
the central limit theorem to calculate (approximately) the probability that 

(a) over the course of the next 50 years, there will be at least 80 floods in this region 
(without making a continuity correction); 

(b) the total time during which the ground will be flooded over the course of the next 
50 floods will be smaller than 200 days. 


Exercises 


Question no. 1 

Telephone calls arrive at an exchange according to a Poisson process with rate λ per 
minute. We know, from past experience, that the probability of receiving exactly one 
call during a one-minute period is three times that of receiving no calls during the same 
time period. We consider 100 consecutive one-minute time periods and we designate by 
U the number of periods during which no calls were received. 

(a) Use a normal approximation to calculate P[U = 5]. 

(b) Use the central limit theorem to calculate approximately 



where X_i is the number of calls received during the ith one-minute period, for 
i = 1, ..., 100. 

Question no. 2 


Let 



be the joint density function of the random vector (X, Y). 

(a) Find the constant k. 

(b) Obtain the marginal density functions of X and Y. 

(c) Calculate VAR[X] and VAR[Y]. 

(d) Calculate the correlation coefficient of X and Y. 
Question no. 3 

In a bank, an automatic teller machine (ATM) enables the customers to withdraw 
$50 or $100 banknotes. It may also happen that a given customer cannot withdraw any 
money if her account is without funds or if the customer in question made an error when 
using the ATM. The number X of customers using the ATM in a five-minute interval 
is a random variable whose probability function p_X(x) is 

  x      |  0     1     2
---------+------------------
 p_X(x)  | 0.3   0.5   0.2


Furthermore, we observed that the total amount Y of money withdrawn in a five-minute 
interval is a random variable whose conditional probability function p_Y(y | X = x) is 
given by 

  y              |  0      50     100    150    200
-----------------+------------------------------------
 p_Y(y | X = 0)  | 1      0      0      0      0
 p_Y(y | X = 1)  | 0.1    0.7    0.2    0      0
 p_Y(y | X = 2)  | 0.01   0.14   0.53   0.28   0.04


(a) Are the random variables X and Y independent? Justify. 

(b) Calculate the probability P[X = 1 ,Y = 100]. 

(c) Calculate the probability P[Y = 0]. 

(d) Find the average number of customers using the ATM in a one-hour period. 

Question no. 4 

A private club decides to organize a charity casino night. The organizers decide to 

• ask their members to cover the overhead costs; 

• admit only 1000 players, each of them with the same initial stake θ (in thousands 
of dollars); 

• choose games such that the gross winnings X_i (in thousands of dollars) of the ith 
player are uniformly distributed on the interval (0, 3θ/2). 


Indication. We have that the mean of a U(0, 3θ/2) distribution is 3θ/4 and its variance 
is equal to 3θ²/16. 

(a) Let Y be the total gross winnings of the 1000 players. Give the approximate distribution
of Y, as well as its parameters. 

(b) Determine the amount θ that each player must pay in order that the net profit (in 
thousands of dollars) of the casino be greater than 50 with probability 0.95. 


Question no. 5 

A certain freeway has three access roads: A, B, and C (see Figure 4.3). The number 
of cars accessing the freeway over a one-hour period, via the three access roads, is 
defined by random variables denoted by X_A, X_B, and X_C and having the following 
characteristics: 

Fig. 4.3. Figure for Exercise no. 5. 

                     |  X_A    X_B    X_C
---------------------+----------------------
 Mean                |  800    1000   600
 Standard deviation  |  40     50     30

Let us designate by X the total number of cars accessing the freeway over a one-hour 
period. 

(a) Calculate 

(i) the mean of X and 

(ii) the standard deviation of X, assuming that the random variables X_A, X_B, and 
X_C are pairwise independent; 

(iii) the probability that the random variable X takes on a value between 2300 and 
2500 if we suppose that the variables X_A, X_B, and X_C are independent and (approximately)
normally distributed; 

(iv) the probability that X is greater than 2500, under the same assumptions as 
above. 


(b) Let Y be the number of times that X is greater than or equal to 2500 (under the 
same assumptions as above) over 100 (independent) one-hour periods. 

(i) Give the distribution of Y and its parameters. 

(ii) Calculate, using an approximation based on a normal distribution, the probability
that the random variable Y is greater than or equal to 10. 

(c) Calculate 

(i) the mean of X and 

(ii) the standard deviation of X if we suppose that the random variables X_A, X_B, 
and X_C are normally distributed and that the correlation coefficients of the three 
pairs of random variables are CORR[X_A, X_B] = 1/2, CORR[X_A, X_C] = 4/5, and 
CORR[X_B, X_C] = -1/2. 


Question no. 6 

The joint density function of the pair (X, Y) of random variables is defined by (see 
the corresponding figure) 

$$f_{X,Y}(x,y) = \begin{cases} 3/4 & \text{if } -1 < x < 1,\ x^2 < y < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

(a) Calculate 

(i) the marginal density functions of X and Y ; 

(ii) the correlation coefficient of X and Y. 

(b) Are the random variables X and Y independent? Justify. 

Question no. 7 

Suppose that 


$$f_{X,Y}(x,y) = \begin{cases} 4/\pi & \text{if } 0 < x < 1,\ 0 < y < \sqrt{2x - x^2}, \\ 0 & \text{elsewhere.} \end{cases}$$

Find the conditional density function of Y, given that X = x. 

Question no. 8 

Calculate the mathematical expectation of (X + Y)^2 if 

$$f_{X,Y}(x,y) = \begin{cases} \frac{1}{8}(x + y) & \text{for } 0 < x < 2,\ 0 < y < 2, \\ 0 & \text{elsewhere.} \end{cases}$$


Question no. 9 

Suppose that X and Y are two random variables such that VAR[X] = VAR[Y] = 1. 
We set Z = X—| Y. Calculate the correlation coefficient of X and Z if COV[X, Z\ = 1 / 2 . 

Question no. 10 

A fair coin is tossed until “heads” is obtained, then until “tails” is obtained. If we 
assume that the successive tosses are independent, what is the probability that the coin 
has to be tossed exactly nine times? 

Question no. 11 

Suppose that X_1 ~ N(2, 4), X_2 ~ N(4, 2), and X_3 ~ N(4, 4) are independent random 
variables. Calculate the 75th percentile of the random variable Y := X_1 - 2X_2 + 4X_3. 

Question no. 12 

The lifetime of a certain type of tire follows (approximately) a normal distribution 
with mean 25,000 km and standard deviation 5000 km. Two (independent) tires are 
taken at random. What is the probability that one of the two tires lasts at least 10,000 
km more than the other? 

Question no. 13 

A factory produces articles whose average weight is equal to 1.62 kg, with a standard 
deviation of 0.05 kg. What is (approximately) the probability that the total weight of 
a batch of 100 articles is between 161.5 kg and 162.5 kg? 

Question no. 14 

The joint density function of the pair (X, Y) of random variables is 



We find that E[X] = 7/12 and VAR[X] = 11/144. Calculate the correlation coefficient 
of X and Y. 

Question no. 15 

Let X ~ N(2, 1) and Y ~ N(4, 4) be two independent random variables. Calculate 
P[|2X - Y| < √8]. 

Question no. 16 

Let X_i, for i = 1, 2, ..., 100, be independent random variables having a gamma 
distribution with parameters α = 9 and λ = 1/3. Calculate approximately $P[\bar{X} > 26]$, 
where $\bar{X} := \frac{1}{100} \sum_{i=1}^{100} X_i$. 


Question no. 17 

Let X be the number of “do” loops in a FORTRAN program and let Y be the 
number of attempts needed by a beginner to get a working program. Suppose that the 
joint probability function of (X, Y) is given by the following table: 


 x\y |  1      2      3
-----+---------------------
  0  | 0.05   0.15   0.10
  1  | 0.10   0.20   0.10
  2  | 0.15   0.10   0.05


(a) Calculate E[XY]. 

(b) Evaluate the probability P[Y > 2 | X = 1]. 

(c) Are the random variables X and Y independent? Justify. 
Question no. 18 

Let 

$$f_{X,Y}(x,y) = \begin{cases} 6x & \text{if } 0 < x < 1,\ x < y < 1, \\ 0 & \text{elsewhere} \end{cases}$$

be the joint density function of (X, Y). 

(a) Calculate the marginal density functions of X and Y. 

(b) Evaluate the probability P[XY < 1/4]. 

Question no. 19 

A device is made up of two independent components. One of the components is placed 
on standby and begins to operate when the other one fails. The lifetime (in hours) of 
each component follows an exponential distribution with parameter λ = 1/2000. Let X 
be the lifetime of the device. 

(a) Give the distribution of X and its parameters. 

(b) What is the mean of X? 

Question no. 20 

The weight (in kilograms) of manufactured items follows approximately a normal 
distribution with parameters μ = 1 and σ² = 0.02. We take 100 items at random. Let 
X_j be the weight of the jth item, for j = 1, 2, ..., 100. We suppose that the X_j's are 
independent random variables. 

(a) Calculate P[X_1 - X_2 < 0.05]. 

(b) Find the number b such that P[X_1 + X_2 < b] = 0.025. 

(c) Calculate approximately, using a normal distribution, the probability that exactly 
70 of the 100 items considered have a weight smaller than 1.072 kg. 

Question no. 21 


Let 



be the joint density function of the random vector (X, Y). 

(a) Find f_X(x) and f_Y(y). 

(b) What is the correlation coefficient of X and Y? Justify. 

(c) What is the 50th percentile of Y? 

(d) Calculate P[Y < e x ). 

Question no. 22 

The time T (in years) elapsed between two major power failures in a particular region 
has an exponential distribution with mean 1.5. The duration X (in hours) of these major 
power failures follows approximately a normal distribution with mean 4 and standard 
deviation 2. We assume that the failures occur independently of one another. 
(a) Given that there were no major power failures during the last year, what is the 
probability that there will be no major power failures over the next nine months? 

(b) How long, at most, do 95% of the major power failures last? 

(c) Calculate the probability that the duration of the next major power failure and that 
of the following one differ by at most 30 minutes. 

(d) Calculate the probability that the longest major power failure, among the next three, 
lasts less than five hours. 

(e) Use the central limit theorem to calculate (approximately) the probability that the 
30th major power failure from now occurs within the next 50 years. 

Question no. 23 

Suppose that 



is the joint density function of the random vector (X, Y). 

(a) Find the marginal density functions of X and Y. 

(b) Calculate P[X - Y < 1/2]. 

(c) Are X and Y independent? Justify. 

(d) Calculate E[XY]. 

Question no. 24 

Let X be the number of customers of a car salesman over a one-day period. Suppose 
that X has a Poisson distribution with parameter λ = 3. Furthermore, suppose that one 
customer in five, on average, buys a car (on a given visit), independently of the other 
customers. Let Y be the number of cars sold by the salesman in one day. 

(a) Given that the salesman had five customers during a given day, what is the proba¬ 
bility that he sold exactly two cars? 

(b) What is the average number of cars sold by the salesman in a one-day period? 
Justify. 

Indication. We have that $E[Y] = \sum_{x=0}^{\infty} E[Y \mid X = x]\, P[X = x]$. Moreover, knowing 
that X = x, Y is a binomial random variable. 

(c) What is the probability that the salesman sells no cars during a given day? 

Indication. We have that = ^' 

(d) Knowing that the salesman sold no cars during a given day, what is the probability 
that he had no customers? 

Question no. 25 

Let X_1, X_2, ..., X_50 be independent random variables having an exponential
distribution with parameter λ = 2. 

(a) Calculate P[X_1 > 4 | X_1 > 1]. 

(b) Let $S = \sum_{i=1}^{50} X_i$. 

(i) Give the exact probability distribution of S, as well as its parameter(s). 

(ii) Calculate approximately, using the central limit theorem, P[S < 24]. 

Question no. 26 

Let X_1 ~ N(0, 1) and X_2 ~ N(1, 3) be two independent random variables. 

(a) Calculate P[|X_1 - X_2| > 1]. 

(b) What is the 90th percentile of Y := X_1 + X_2? 

Question no. 27 

The joint density function of the random vector (X, Y, Z) is given by 



$$f_{X,Y,Z}(x,y,z) = \begin{cases} k[(x/y) + z] & \text{if } 0 < x < 1,\ 1 < y < e,\ -1 < z < 1, \\ 0 & \text{elsewhere.} \end{cases}$$


(a) Find the constant k. 

(b) Show that 



(c) Are X and Y independent random variables? Justify. 

(d) Calculate the mathematical expectation of Y/X. 

Question no. 28 

Let X be a random variable following a gamma distribution with parameters α = 25 
and λ = 1/2. 

(a) Calculate the probability P[X < 40] by making use of a Poisson distribution. 

(b) Use the central limit theorem to calculate (approximately) P[40 < X < 50]. Justify 
the use of the central limit theorem. 

Question no. 29 

Let X be a random variable having a binomial distribution with parameters n = 100 
and p = 1/2, and let Y be a random variable following a normal distribution with 
parameters μ = 50 and σ² = 25. 

(a) Calculate approximately P[X < 40]. 

(b) Calculate the probability P[X = Y]. 

(c) What is the 33rd percentile of Y? 

Question no. 30 

The table below presents the joint probability distribution of the random vector 
(X, Y): 

 x\y |  1      2      3
-----+---------------------
  0  | 1/9    2/9    1/9
  1  | 1/18   1/9    1/18
  2  | 1/6    1/18   1/9


Calculate E[2XY]. 

Question no. 31 


Let 


$$f_{X,Y}(x,y) = \begin{cases} 4xy & \text{if } 0 < x < 1,\ 0 < y < 1, \\ 0 & \text{elsewhere} \end{cases}$$

be the joint density function of the random vector (X, Y). Calculate the probability 
P[X^2 + Y^2 < 1/4]. 

Question no. 32 

Let X_1, X_2, and X_3 be random variables such that (i) X_1 and X_2 are independent, 
(ii) VAR[X_i] = 2, for i = 1, 2, 3, (iii) COV[X_1, X_3] = 1/2, and (iv) COV[X_2, X_3] = 1. 
Calculate VAR[X_1 + X_2 + 2X_3]. 

Question no. 33 

The lifetime (in years) of a certain machine follows approximately a N(5, 4) distribution.
Use the central limit theorem to calculate (approximately) the probability that 
at most 10, among 30 (independent) machines of this type, last at least six years. 

Question no. 34 

The joint density function of the random vector (X, Y) is given by 



(a) Check that f_{X,Y}(x, y) is a valid joint density function. 

(b) Calculate the marginal density functions f_X(x) and f_Y(y). 

(c) Are X and Y independent? Justify. 

(d) Calculate P[X^2 + Y^2 > 1/4]. 

Question no. 35 

A fair die is rolled 30 times, independently. Let X be the number of 6s obtained and 
Y be the sum of all the numbers obtained. 

(a) Use a Poisson distribution to calculate (approximately) P[X > 5] (even if the 
probability of success is relatively large). 

(b) Use the central limit theorem to calculate (approximately) P[100 < Y < 111]. 

Indication. If W is the number obtained on an arbitrary roll of a fair die, then we have 
that E[W] = 7/2 and VAR[W] = 35/12. 

Question no. 36 

Let 


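$$f_X(x) = \begin{cases} 1/10 & \text{if } 0 < x < 10, \\ 0 & \text{elsewhere} \end{cases}$$

and 

$$p_Y(y) = \begin{cases} 1/10 & \text{if } y = 1, 2, \ldots, 10, \\ 0 & \text{otherwise.} \end{cases}$$

Suppose that X and Y are independent random variables. Calculate (a) the probability 
P[X > Y] and (b) VAR[XY]. 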

Question no. 37 

Suppose that X and Y are two random variables such that VAR[X] = VAR[Y] = 
1 and COV[X, Y] = 2/3. For what value of k is the correlation coefficient of X and 
Z := X + kY equal to 2/3? 

Question no. 38 

An electronic device is made up of ten components whose lifetime (in months) follows
an exponential distribution with mean 50. Suppose that the components operate 
independently of one another. Let T be the lifetime of the device. Obtain the density 
function of T if 

(a) the components are connected in series; 

(b) the components are connected in parallel; 

(c) the components are placed in standby redundancy. That is, only one component 
operates at a time and, when it fails, it is immediately replaced by another component 
(if there remains at least one working component). 

Question no. 39 

Electric light bulbs bought to illuminate an outside rink have an average lifetime of 
3000 hours, with a standard deviation of 339 hours, independently from one light bulb 
to the other. Suppose that the lifetime of the light bulbs follows approximately a normal 
distribution. 

(a) If it is more economical to replace all the light bulbs when 20% among them are 
burnt out, rather than to change the light bulbs when needed, after how many hours 
should we replace them? 

(b) Suppose that only the burnt-out light bulbs have been replaced after t\ hours, where 
t\ is the time when 20% of the light bulbs should be burnt out. Find the percentage of 
light bulbs that will be burnt out after \t\ additional hours. 

Question no. 40 

The continuous random vector (X,Y) has the following joint density function: 



(a) Calculate the marginal density function of Y. 


4.6 Exercises for Chapter 4 153 


(b) Find the 40th percentile of Y. 

(c) Let Z = Y^3. Obtain the density function of Z. 

(d) Calculate the probability P[X < Y]. 

Question no. 41 

A fair die is tossed twice, independently. Let X be the number of 5s and Y be the 
number of 6s obtained. Calculate 

(a) the joint probability function p_{X,Y}(x, y), for 0 < x + y < 2; 

(b) the function F_{X,Y}(1/2, 3/2); 

(c) the standard deviation of 2 X ; 

(d) the correlation coefficient of X and Y. 

Question no. 42 

Let 

  x_1           |  -2    -1     1     2
----------------+-------------------------
 p_{X_1}(x_1)   | 1/3   1/6   1/3   1/6


and 

  x_2                       |  0     1
----------------------------+------------
 p_{X_2}(x_2 | X_1 = -2)    | 1/2   1/2
 p_{X_2}(x_2 | X_1 = -1)    | 1/2   1/2
 p_{X_2}(x_2 | X_1 = 1)     | 1     0
 p_{X_2}(x_2 | X_1 = 2)     | 0     1


(a) Calculate 

(i) the marginal probability function of X_2; 

(ii) the probability p_{X_2}(x_2 | {X_1 = -2} ∪ {X_1 = 2}), for x_2 = 0 and 1. 

(b) Let Y = 2X 1 + X\. Find the distribution function of Y. 

Question no. 43 

The duration X (in hours) of the major breakdowns of a given subway system 
follows approximately a normal distribution with mean μ = 2 and standard deviation 
σ = 0.75. We assume that the durations of the various breakdowns are independent 
random variables. 

(a) Calculate (exactly) the probability that the duration of each of more than 40 of the 
next 50 major breakdowns is smaller than three hours. 

Remark. This question requires the use of a pocket calculator or a software package. 

(b) Use a normal distribution to calculate approximately the probability in part (a). 
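
Since part (a) calls for a calculator or software, here is one possible sketch in Python. It assumes that the number of breakdowns (among 50) lasting less than three hours is binomial, with success probability obtained from the N(2, 0.75²) model, and that "more than 40" means at least 41; the function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binomial_upper_tail(n: int, p: float, k: int) -> float:
    """Exact P[X > k] for X ~ B(n, p), summing the probability function."""
    return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j)
               for j in range(k + 1, n + 1))


# Probability that a single breakdown lasts less than three hours.
p = normal_cdf((3.0 - 2.0) / 0.75)

# Part (a): exact binomial tail, P[more than 40 of the next 50].
exact = binomial_upper_tail(50, p, 40)

# Part (b): normal approximation with a continuity correction (an assumption
# about the intended method; the book may or may not use the correction).
mean = 50 * p
sd = math.sqrt(50 * p * (1.0 - p))
approx = 1.0 - normal_cdf((40.5 - mean) / sd)

print(p, exact, approx)
```

Because p is close to 1 here, the binomial distribution is noticeably skewed, so a small gap between the exact value and the normal approximation is to be expected.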

Question no. 44 

Suppose that X_1, ..., X_9 are independent random variables having an exponential 
distribution with parameter λ = 1/2. 

(a) Calculate the probability 

$$P\left[\frac{X_1 + X_2 + X_3}{X_4 + \cdots + X_9} < 1.5\right].$$

(b) Let Y = X_1 + X_2. 

(i) Calculate P[Y < y | X_1 = x_1], where x_1 > 0 and y > 0. 

(ii) Obtain the conditional density function f_Y(y | X_1 = x_1). 

Question no. 45 

We consider a discrete random variable X having a hypergeometric distribution with 
parameters N = 10, n = 5, and d = 2. 

(a) We define Y = X^2. Calculate the correlation coefficient of X and Y. 

(b) Let Xi, X 2 ,..., X$ be independent random variables following the same distribution 
as X. We define Z = X\ + • • • + Xg. Calculate the probability P[8 < Z < 12]. 

Remark. This question requires the use of a software package. 
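
One way to carry out the software computation in part (b) is to convolve the probability function of X with itself. The sketch below uses exact fractions. The number of summands `m` is an assumption, because the subscript in the statement above is not legible, and the inclusive bounds in the final sum are also an assumption; all helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(N: int, n: int, d: int) -> dict:
    """p(k) = C(d, k) C(N - d, n - k) / C(N, n), returned as {k: Fraction}."""
    return {k: Fraction(comb(d, k) * comb(N - d, n - k), comb(N, n))
            for k in range(max(0, n - (N - d)), min(n, d) + 1)}


def convolve(p: dict, q: dict) -> dict:
    """Probability function of the sum of two independent discrete variables."""
    out = {}
    for a, pa in p.items():
        for b, qb in q.items():
            out[a + b] = out.get(a + b, Fraction(0)) + pa * qb
    return out


def pmf_of_sum(p: dict, m: int) -> dict:
    """Probability function of the sum of m independent copies of X."""
    total = {0: Fraction(1)}
    for _ in range(m):
        total = convolve(total, p)
    return total


p_x = hypergeometric_pmf(10, 5, 2)   # parameters N = 10, n = 5, d = 2
m = 8                                # ASSUMPTION: number of summands in Z
p_z = pmf_of_sum(p_x, m)

# ASSUMPTION: bounds taken as inclusive; adjust if strict inequalities are meant.
prob = sum(v for z, v in p_z.items() if 8 <= z <= 12)
print(float(prob))
```

Working with `Fraction` avoids rounding error in the repeated convolutions; converting to `float` only at the end keeps the printed value readable.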

Question no. 46 

Suppose that the joint probability function of the random vector (X, Y) is given by 




Px,y(x,v) 


0 otherwise. 


(a) Obtain the functions p_X(x), p_Y(y), and p_Y(y | X = 35). 

(b) Calculate the probability P[ 12 < Y < 18 | X = 35] 

(i) exactly (with the help of a software package, if possible); 

(ii) using an approximation based on a normal distribution. 

(c) Calculate the probability P[X < 2 | X > 2]. 

Question no. 47 

A system is made up of three components, C_1, C_2, and C_3, connected in parallel. The 
lifetime T_1 (in years) of component C_1 follows (approximately) a normal distribution 
with parameters μ = 4 and σ² = 2.25. In the case of component C_2, its lifetime T_2 has an 
exponential distribution with parameter λ = 1/4. Finally, the lifetime T_3 of component 
C_3 has a gamma distribution with parameters α = 2 and λ = 1/2. Furthermore, we 
assume that the random variables T_1, T_2, and T_3 are independent. 

(a) Calculate the probability that the system operates for more than one year. 

(b) We consider 500 systems similar to the one described above. Calculate, assuming 
that these 500 systems are independent, the probability that 2, 3, 4, or 5 among them 
are down after one year 

(i) exactly (with the help of a software package or a pocket calculator); 

(ii) using an approximation based on a Poisson distribution. 
(c) Suppose that at first only components C_1 and C_2 are active. Component C_3 is on 
standby and begins to operate as soon as C_1 and C_2 are both down, or after one year if 
C_1 or C_2 still operates at that time. Suppose also that if C_1 or C_2 still operates after one 
year, then T_3 ~ G(2, 1/2); otherwise T_3 has an exponential distribution with parameter 
λ = 1. Calculate the probability that component C_3 operates for at least two years. 

Question no. 48 

Suppose that 


$$f_{X,Y}(x,y) = \begin{cases} 1/8 & \text{if } x > 0,\ y > 0,\ 0 < x + y < 4, \\ 0 & \text{elsewhere} \end{cases}$$

is the joint density function of the random vector (X, Y). 

(a) Obtain the marginal density function f_X(x). 

(b) Calculate (i) the expected value of X, (ii) its variance, (iii) its skewness β_1, and (iv) 
its kurtosis β_2. 

(c) Calculate the correlation coefficient of X and Y. Are the random variables X and 
Y independent? Justify. 

Question no. 49 

In a particular region, the daily temperature X (in degrees Celsius) during the 
month of September has a normal distribution with parameters μ = 15 and σ² = 
25. Calculate, using an approximation based on a normal distribution, the probability 
that the temperature exceeds 17 degrees Celsius on exactly 10 days over the course of 
September. 

Question no. 50 

Consider the joint density function 


$$f_{X,Y}(x,y) = \begin{cases} 90x^2 y(1 - y) & \text{if } 0 < y < 1,\ 0 < x < y, \\ 0 & \text{elsewhere.} \end{cases}$$

(a) Calculate the marginal density functions f_X(x) and f_Y(y). 

(b) Are X and Y independent random variables? Justify. 

(c) Calculate the covariance of X and Y. 

Question no. 51 

Let X_1 and X_2 be two discrete random variables whose joint probability function is 
given by 

2 m* 1 m*V 5 \ 2-11-12 

P x„ A ' I (x„x 2 ), xite!(2 _ ii _ X2)! y 

if x_1 ∈ {0, 1, 2}, x_2 ∈ {0, 1, 2}, and x_1 + x_2 ≤ 2 [and p_{X_1,X_2}(x_1, x_2) = 0, otherwise]. 

(a) Let Y_1 = X_1 + X_2 and Y_2 = X_1 - X_2. Find the probability functions of Y_1 and Y_2. 

(b) Calculate the function p_{Y_2}(y_2 | Y_1 = 2). 

Question no. 52 

A number X is taken at random in the interval [-1, 1], and then a number Y is 
taken at random in the interval [-1, X]. 

(a) Find f_{X,Y}(x, y) and f_Y(y). 

(b) Calculate (i) E[(X + 1)Y] and (ii) E[Y]. 

(c) Use part (b) to calculate COV[X, Y]. 

Question no. 53 

The storage tank of a gas station is usually filled every Monday. The capacity of the 
storage tank is equal to 20,000 liters. The gas station owner is told, on a given Monday, 
that there will be no gasoline delivery the next Monday. What is the probability that 
the gas station will not be able to satisfy the demand for a two-week period (with the 
20,000 liters) if the weekly demand (in thousands of liters) follows 

(a) an exponential distribution with parameter λ = 1/10? 

(b) a gamma distribution with parameters α = 5 and λ = 1/2? 

Question no. 54 

A random variable X has the following probability function: 


  x      |  -1    0     1
---------+------------------
 p_X(x)  | 1/8   3/4   1/8


Let X_1 and X_2 be two independent random variables distributed as X. We set 
Y = X_2 - X_1. 

(a) Obtain the joint probability function of the pair (X_1, X_2). 

(b) Calculate the correlation coefficient of X_1 and Y. 

(c) Are the random variables X_1 and Y independent? Justify. 

Question no. 55 

Let X_1, ..., X_10 be independent random variables having an exponential distribution 
with parameter λ = 1. We define $Y = \sum_{i=1}^{10} X_i$. 

(a) Evaluate, without making use of the central limit theorem, the probability P[Y < 5]. 

(b) Use the central limit theorem to evaluate P[Y > 10]. 

Multiple choice questions 


Question no. 1 

Let X ~ N(0, 1) and Y ~ N(1, 4) be two random variables such that COV[X, Y] = 1. 
Calculate P[X + Y < 12]. 

(a) 0 (b) 0.6915 (c) 0.8413 (d) 0.9773 (e) 1 
Question no. 2 

Calculate P[3 < X + Y < 6] if X ~ Poi(1) and Y ~ Poi(2) are independent random 
variables. 

(a) 0.269 (b) 0.493 (c) 0.543 (d) 0.726 (e) 0.916 

Question no. 3 

Use the approximation of the binomial distribution by a normal distribution to 
calculate P[X < 12], where X ~ B(n = 25, p = 1/2). 

(a) 0.4207 (b) 0.4681 (c) 0.5 (d) 0.5319 (e) 0.5793 

Question no. 4 


Let 



Find f x (x). 

(a) 2^\/4 — x 2 if — 2 < x < 2 (b) 7 X ^/4 — x 2 if 0 < x < 2 

(c) X ^/4 — x 2 if — 2 < x < 2 (d) X ^/4 — x 2 if 0 < x < 2 

(e) ^ V4 — x 2 if —2 < x < y 

Question no. 5 

Suppose that p_X(x | Y = y) = 1/3, for x = 0, 1, 2 and y = 1, 2, and that p_Y(y) = 1/2, 
for y = 1, 2. Calculate p_{X,Y}(1, 2). 

(a) 1/6 (b) 1/3 (c) 1/2 (d) 2/3 (e) 1 

Question no. 6 

Let X be a random variable such that E[X^n] = 1/2, for n = 1, 2, ... . We set 
Y = X^2. Calculate ρ_{X,Y}. 


(a) 0 (b) 1/4 (c) 1/2 (d) 3/4 (e) 1 


Question no. 7 

Suppose that the random variable X is such that E[X] = VAR[X] = 1. Calculate 
(approximately) the probability $P\left[\sum_{i=1}^{49} X_i < 56\right]$, where X_1, X_2, ..., X_49 are 
independent random variables distributed as X. 

(a) 0.5 (b) 0.6554 (c) 0.8413 (d) 0.8643 (e) 1 

Question no. 8 

We define W = 3X + 2Y - Z, where X, Y, and Z are independent random variables 
such that $\sigma_X^2 = 1$, $\sigma_Y^2 = 4$, and $\sigma_Z^2 = 9$. Calculate $\sigma_W$. 

(a) √2 (b) 4 (c) √20 (d) √34 (e) 10 

Question no. 9 

We consider the joint density function 


$$f_{X,Y}(x,y) = \begin{cases} 1/4 & \text{if } 0 < x < 2,\ 0 < y < 2, \\ 0 & \text{elsewhere.} \end{cases}$$

Calculate P[X > 2Y]. 

(a) 1/8 (b) 1/4 (c) 1/2 (d) 3/4 (e) 7/8 

Question no. 10 


Calculate P[2X < Y] if 

$$f_{X,Y}(x,y) = \begin{cases} 2 & \text{for } 0 < x < y < 1, \\ 0 & \text{elsewhere.} \end{cases}$$


(a) 0 (b) 1/4 (c) 1/2 (d) 3/4 (e) 1 


Question no. 11 

We define X = max{X_1, X_2}, where X_1 and X_2 are the numbers obtained by simultaneously
rolling two fair dice. That is, X is the greater of the two numbers observed. 
Calculate E[X]. 

(a) 91/36 (b) 3.5 (c) 4 (d) 161/36 (e) 4.5 

Question no. 12 

Suppose that 



and 


\ if 0 < y < 2 , 


fr(y) 


0 elsewhere. 


Find fx,v{x,y)- 

(a) | if 0 < x < 4 and 0 < y < 2 (b)^if0<x<4 and 0 < y < 2 

(c) ^ if 0 < x < 2y and 0 < y < 2 (d) ^ if 0 < x < 2y and 0 < y < 2 

(e) | if 0 < x < 4 and 0 < y < 2 

Question no. 13 

Let

x          −2     0     2
$p_X(x)$   1/8   3/4   1/8

be the probability function of the random variable X. We define $Y = -X^2$. Calculate
the correlation coefficient of X and Y.

(a) −1 (b) −1/2 (c) 0 (d) 1/2 (e) 1

Question no. 14

Suppose that X and Y are two independent random variables. We define two other 
random variables by R = aX + b and S = cY + d. For what values of a, b, c, and d are
the variables R and S uncorrelated (i.e., $\rho_{R,S} = 0$)?

(a) none (b) a = b = 1 (c) b = d = 0 (d) a = c = 1 , b = d = 0 (e) all 

Question no. 15 

Suppose that $X_1$ and $X_2$ are two independent random variables uniformly distributed
on the interval [0,1]. Let X be the smaller of the two random variables. Calculate
P[X > 1/4].

(a) 1/16 (b) 1/8 (c) 1/4 (d) 9/16 (e) 3/4 

Question no. 16 

Calculate $P[X_1 + X_2 < 2]$ if $X_1$ and $X_2$ are two independent random variables having
an exponential distribution with parameter λ = 1.

(a) 0.324 (b) 0.405 (c) 0.594 (d) 0.676 (e) 0.865 

Question no. 17 

Let $X_1, \ldots, X_{36}$ be independent random variables, where $X_i$ follows a gamma distribution
with parameters α = 2 and λ = 3, for all i. Calculate (approximately)
$P[2/3 < \bar{X} < 3/4]$, where $\bar{X} := \frac{1}{36} \sum_{i=1}^{36} X_i$.

(a) 0.218 (b) 0.355 (c) 0.360 (d) 0.497 (e) 0.855 

Question no. 18 

Suppose that $X_1, \ldots, X_n$ are independent N(0,1) random variables. What is the
smallest value of n for which $P\left[-0.1n < \sum_{i=1}^{n} X_i < 0.1n\right] \ge 0.95$?

(a) 19 (b) 20 (c) 271 (d) 384 (e) 385 

Question no. 19 

Let $X_1 \sim N(0,1)$, $X_2 \sim N(-1,1)$, and $X_3 \sim N(1,1)$ be independent random variables.
Calculate $P[|X_1 + 2X_2 - 3X_3| > 5]$.

(a) 0.004 (b) 0.496 (c) 0.5 (d) 0.504 (e) 0.996 


Question no. 20 

Calculate approximately, by means of a normal distribution, $P\left[\sum_{i=1}^{100} X_i < 251\right]$,
where $X_1, \ldots, X_{100}$ are independent random variables such that $X_i$ has a binomial
distribution with parameters n = 10 and p = 1/4, for i = 1, ..., 100.

(a) 0.50 (b) 0.51 (c) 0.53 (d) 0.56 (e) 0.59

Reliability


In many applied fields, particularly in most engineering disciplines, it is important to
be able to calculate the probability that a certain device or system will be active at a
given time instant, or over a fixed period of time. We already considered many exercises
on reliability theory in Chapters 2 to 4. In Chapter 2, it was understood that we were
calculating the reliability of a system at a given time instant $t_0$, knowing the reliability
of each of its components at $t_0$. In order to calculate the probability that a machine will
operate without failure for a given amount of time, we need to know the distribution
of its lifetime or of the lifetime of its components. The problem then becomes one on random
variables or vectors. In this chapter, we present in detail the main concepts of reliability
theory.


5.1 Basic notions 

There are many possible interpretations of the word reliability. In this textbook, it 
always corresponds to the probability of functioning correctly at a given time instant or 
over a given period of time. Moreover, in the current chapter, we are mainly interested 
in the reliability over a certain time interval [0, t].

Definition 5.1.1. Let X be a nonnegative random variable representing the lifetime (or 
time to failure) of a system or a device. The probability 

$$R(x) = P[X > x] \quad [= 1 - F_X(x)] \quad \text{for } x \ge 0$$

is called the reliability function or survival function of the system. 

Remarks. (i) The function R(x) can also be denoted by S(x). The notation
$\bar{F}_X(x) = 1 - F_X(x)$ is used as well.

(ii) Most often, it is assumed that the random variable X is continuous. However, in
some applications, the lifetime is measured in number of cycles. Therefore, X is then a
discrete (integer-valued, to be precise) random variable. Furthermore, if we accept the
possibility that a device may be defective, then X could take on the value 0 and be a
mixed type random variable.

(iii) All discrete distributions considered in Section 3.2 could serve as reliability models.
In the case of the continuous distributions, we must limit ourselves to the ones that are
always nonnegative. Therefore, the normal distribution cannot be an exact reliability
model. Nonetheless, depending on the values of the parameters μ and σ, it can be a good
approximate model for the survival time of a machine. Furthermore, we can consider
the truncated normal distribution, defined for x ≥ 0.

A useful measure of the dependability of a system is its mean lifetime E[X]. In the 
context of reliability theory, E[X] is called the mean time to failure of the system. 

Definition 5.1.2. The symbol MTTF (which stands for Mean Time To Failure)
denotes the expected value of the lifetime X of a system. If the system can be repaired,
we also define the symbols MTBF (Mean Time Between Failures) and MTTR
(Mean Time To Repair). We have that MTBF = MTTF + MTTR.

Remarks. (i) Suppose that we are interested in the lifetime X of a car. It is obvious that,
except in case of a very major failure, the car will be repaired when it breaks down.
When we calculate the quantity MTBF, we assume that, after having been “repaired,”
a system is as good as new. Of course, in the case of a car, this is not exactly true,
because cars age and wear.

(ii) To distinguish between critical and noncritical failures, we can use the more precise
term Mean Time Between Critical Failures (MTBCF). Then, MTBF could be interpreted
as the mean time between failures of any type, that is, critical or noncritical. In
the context of a computer or data transmission system, we also have the Mean Time
Between System Aborts (MTBSA).

To calculate the mean lifetime of a system, we can, of course, use the definition of 
the expected value of a random variable. However, it is sometimes simpler to proceed 
as in the following proposition. 

Proposition 5.1.1. Let X be a nonnegative random variable. Then, we have:

$$E[X] = \begin{cases} \sum_{k=0}^{\infty} P[X > k] = \sum_{k=0}^{\infty} R(k) & \text{if } X \in \{0, 1, \ldots\}, \\[4pt] \int_0^{\infty} P[X > x]\,dx = \int_0^{\infty} R(x)\,dx & \text{if } X \in [0, \infty). \end{cases}$$

Proof. Consider first the case when X is an integer-valued random variable. We have:

$$\begin{aligned}
E[X] &:= \sum_{j=0}^{\infty} j\,P[X = j] = \sum_{j=1}^{\infty} j\,P[X = j] = \sum_{j=1}^{\infty} \sum_{k=1}^{j} P[X = j] \\
&= \sum_{k=1}^{\infty} \sum_{j=k}^{\infty} P[X = j] = \sum_{k=1}^{\infty} P[X \ge k] = \sum_{k=0}^{\infty} P[X > k].
\end{aligned}$$

Similarly, if X (is continuous and) belongs to the interval [0, ∞), we can write that

$$\begin{aligned}
E[X] &:= \int_0^{\infty} t\,f_X(t)\,dt = \int_0^{\infty} \int_0^{t} f_X(t)\,dx\,dt \\
&= \int_0^{\infty} \int_x^{\infty} f_X(t)\,dt\,dx = \int_0^{\infty} P[X > x]\,dx.
\end{aligned}$$


Remarks. (i) It is not necessary that the discrete random variable X can take on all
nonnegative integers. It is sufficient that the set of possible values of X be included in
{0, 1, ...}. Likewise, in the continuous case, X must take its values in the interval [0, ∞).

(ii) Often, the formulas in the proposition do not simplify the calculation of the expected
value of X. For example, it is more complicated to calculate the mean of a Poisson
random variable from the first formula than from the definition.


Example 5.1.1. Let X be a geometric random variable with parameter p in the interval
(0,1). Its possible values are the integers 1, 2, ... and its mean is equal to 1/p. We saw
(on p. 64) that

$$P[X > k] = (1-p)^k \quad \text{for } k = 0, 1, \ldots .$$

It follows that (indeed)

$$E[X] = \sum_{k=0}^{\infty} (1-p)^k = \frac{1}{1 - (1-p)} = \frac{1}{p}.$$
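The identity used in this example can be checked numerically. The following Python sketch is only an illustration: the function names and the truncation level `terms` are assumptions introduced here. It compares the truncated sum of P[X > k] with the truncated sum of j P[X = j] and with 1/p.

```python
def mean_from_tail(p, terms=10_000):
    """Truncated version of sum_{k>=0} P[X > k], with P[X > k] = (1 - p)**k."""
    return sum((1.0 - p) ** k for k in range(terms))


def mean_from_definition(p, terms=10_000):
    """Truncated version of sum_{j>=1} j * P[X = j], with P[X = j] = (1 - p)**(j - 1) * p."""
    return sum(j * (1.0 - p) ** (j - 1) * p for j in range(1, terms + 1))


if __name__ == "__main__":
    p = 0.2  # illustrative value of the parameter
    print(mean_from_tail(p), mean_from_definition(p), 1.0 / p)
```

Both sums should agree with 1/p up to rounding, the tail-sum version being a plain geometric series.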


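Example 5.1.2. If X ~ Exp(λ), we find (see Example 3.5.2) that

$$P[X > x] = \int_x^{\infty} \lambda e^{-\lambda t}\,dt = e^{-\lambda x} \quad \text{for } x \ge 0.$$

Hence,

$$E[X] = \int_0^{\infty} e^{-\lambda x}\,dx = -\frac{e^{-\lambda x}}{\lambda}\Big|_0^{\infty} = \frac{1}{\lambda}.$$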
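The continuous formula of Proposition 5.1.1 can also be approximated by integrating the reliability function numerically. The sketch below assumes a simple trapezoidal rule and a cutoff `upper` beyond which R(x) is treated as negligible; both choices, and the helper name, are illustrative assumptions.

```python
import math


def mean_from_survival(survival, upper, steps=100_000):
    """Approximate E[X] = integral_0^upper R(x) dx with the trapezoidal rule.

    `upper` is a cutoff beyond which R(x) is assumed to be negligible.
    """
    h = upper / steps
    total = 0.5 * (survival(0.0) + survival(upper))
    for i in range(1, steps):
        total += survival(i * h)
    return total * h


if __name__ == "__main__":
    lam = 2.0  # illustrative rate
    approx = mean_from_survival(lambda x: math.exp(-lam * x), upper=40.0 / lam)
    print(approx, 1.0 / lam)
```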


In Chapter 4, we defined various conditional functions, for example, the function
$f_X(x \mid Y = y)$, where (X, Y) is a continuous random vector. We can also define functions
of the type $f_X(x \mid A_X)$, where $A_X$ is an event that involves only the random variable
X. One such particular conditional density function is important in reliability theory.

Definition 5.1.3. Suppose that the lifetime T of a system is a continuous nonnegative
random variable. The failure rate function (or hazard rate function) r(t) of the
system is defined by

$$r(t) = f_T(t \mid T > t) := \lim_{s \downarrow t} \frac{d}{ds} F_T(s \mid T > t) \quad \text{for } t \ge 0.$$


Remarks. (i) The function r(t), multiplied by dt, can be interpreted as the probability
that a machine, which is t time units old and still operating, will break down in the
interval (t, t + dt]. Indeed, we have:

$$f_T(t \mid T > t) = \lim_{dt \downarrow 0} \frac{P[t < T \le t + dt \mid T > t]}{dt}.$$

(ii) We assume that the conditional distribution function $F_T(s \mid T > t)$ is differentiable
at $s \in (t, t + dt]$.

(iii) We must take the limit as s decreases to t in the definition, because we have that
$F_T(t \mid T > t) = P[T \le t \mid T > t] = 0$. However, we don’t have to take any limit to
calculate $f_T(s \mid T > t)$ from $F_T(s \mid T > t)$, for s > t.


Proposition 5.1.2. We have:

$$r(t) = \frac{f_T(t)}{1 - F_T(t)} = \frac{f_T(t)}{R(t)} \quad \text{for } t \ge 0.$$


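Proof. By definition,

$$F_T(s \mid T > t) = P[T \le s \mid T > t] = \frac{P[\{T \le s\} \cap \{T > t\}]}{P[T > t]} = \begin{cases} 0 & \text{if } s \le t, \\[4pt] \dfrac{F_T(s) - F_T(t)}{1 - F_T(t)} & \text{if } s > t. \end{cases}$$

Hence,

$$f_T(s \mid T > t) := \frac{d}{ds} F_T(s \mid T > t) = \frac{f_T(s)}{1 - F_T(t)} \quad \text{if } s > t.$$

Taking the limit as s decreases to t, we obtain that

$$r(t) = \frac{f_T(t)}{1 - F_T(t)} \quad \text{for } t \ge 0.$$

Finally, because

$$R'(t) = \frac{d}{dt}[1 - F_T(t)] = -f_T(t),$$

we also have:

$$r(t) = -\frac{R'(t)}{R(t)} \quad \text{for } t \ge 0.$$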


Remark. In the discrete case, the failure rate function is given by

$$r(k) = \frac{p_X(k)}{\sum_{j=k}^{\infty} p_X(j)} \quad \text{for } k = 0, 1, \ldots .$$

Note that 0 ≤ r(k) ≤ 1 for any k, whereas r(t) ≥ 0 in the continuous case.

Example 5.1.3. One of the most commonly used models in reliability theory is the
exponential distribution, mainly because of its memoryless property (see p. 76). This
property implies that for a system whose lifetime is exponentially distributed, the failure
rate function is constant. Indeed, if X ~ Exp(λ), we have (see the previous example):

$$r(t) = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda \quad \text{for } t \ge 0.$$

In practice, this is generally not realistic. There are some applications though for which
this is acceptable. For example, it seems that the lifetime of an electric fuse that cannot
melt only partially is approximately exponentially distributed. The time between the
failures of a system made up of a very large number of independent components connected
in series can also follow approximately an exponential distribution, if we assume,
in particular, that every time a component fails it is immediately replaced by a new
one. However, in most cases, the exponential distribution should only be used for t in a
finite interval.

Example 5.1.4. The geometric distribution is the equivalent of the exponential distribution
in discrete time. It also possesses the memoryless property. Because P[X ≥ k] =
P[X > k − 1], we calculate (see Example 5.1.1)

$$r(k) = \frac{(1-p)^{k-1}\,p}{(1-p)^{k-1}} = p \quad \text{for } k = 1, 2, \ldots .$$

Therefore, the failure rate function r(k) is a constant in this case too, as expected.
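The discrete failure rate can be evaluated directly from a probability function. The sketch below is a hedged illustration: the helper names are hypothetical and the infinite sum in the denominator is truncated after `terms` terms, which is an assumption that is harmless only when the tail decays quickly, as it does for the geometric distribution.

```python
def discrete_failure_rate(pmf, k, terms=5_000):
    """r(k) = p_X(k) / sum_{j>=k} p_X(j), the infinite sum being truncated after `terms` terms."""
    tail = sum(pmf(j) for j in range(k, k + terms))
    return pmf(k) / tail


def geometric_pmf(p):
    """Probability function of a geometric distribution on {1, 2, ...}."""
    return lambda j: (1.0 - p) ** (j - 1) * p if j >= 1 else 0.0


if __name__ == "__main__":
    pmf = geometric_pmf(0.3)
    print([round(discrete_failure_rate(pmf, k), 6) for k in range(1, 6)])
```

For the geometric distribution, every value printed should be equal to p, in agreement with the example.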

The failure rate function of a given distribution is a good indicator of the value of
this distribution as a model in reliability theory. In most applications, r(t) should be a
strictly increasing function of t, at least when t is large enough.

Definition 5.1.4. If the random variable X is such that its failure rate function $r_X(t)$
or $r_X(k)$ is increasing (resp., decreasing) in t or k, then X is said to have an increasing
failure rate (resp., decreasing failure rate) distribution.

Notation. We use the acronym IFR (resp., DFR) for Increasing Failure Rate (resp.,
Decreasing Failure Rate).

Now, making use of Proposition 5.1.2, we obtain that

$$\int_0^t r(s)\,ds = -\int_0^t \frac{R'(s)}{R(s)}\,ds = -\ln R(t) + \ln R(0).$$

Moreover, the random variable T [with failure rate function r(t)] being continuous and
nonnegative, we may write that R(0) := P[T > 0] = 1. Hence, we may state the
following proposition.

Proposition 5.1.3. There is a one-to-one relationship between the functions R(t) and
r(t):

$$R(t) = \exp\left\{-\int_0^t r(s)\,ds\right\}.$$

Remark. The proposition implies that the exponential distribution is the only continuous
distribution having a constant failure rate function.
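The relationship between r(t) and R(t) lends itself to a quick numerical check. The sketch below assumes a trapezoidal approximation of the integral of r(s) over [0, t]; the function name and the two test failure rates (a constant one and the linear one r(s) = 2s) are illustrative choices, not taken from the text.

```python
import math


def reliability_from_hazard(hazard, t, steps=10_000):
    """R(t) = exp(-integral_0^t r(s) ds), the integral being approximated by the trapezoidal rule."""
    if t == 0.0:
        return 1.0
    h = t / steps
    integral = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        integral += hazard(i * h)
    return math.exp(-integral * h)


if __name__ == "__main__":
    lam = 1.5
    # Constant failure rate: the exponential reliability function is recovered.
    print(reliability_from_hazard(lambda s: lam, 2.0), math.exp(-lam * 2.0))
    # Linear failure rate r(s) = 2s: R(t) = exp(-t**2).
    print(reliability_from_hazard(lambda s: 2.0 * s, 1.2), math.exp(-1.2 ** 2))
```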


Example 5.1.5. We can show that the failure rate function r(t) of a lognormal distribution
starts at zero [because $\lim_{t \downarrow 0} f_T(t) = 0$], next it increases to a maximum, and
then

$$\lim_{t \to \infty} r(t) = 0.$$

So, we must conclude that the lognormal distribution is not a good model for the lifetime
of a device that is subject to wear, at least not for t large. Indeed, the failure rate should
generally increase with t, as mentioned above.


Example 5.1.6. The normal distribution N(μ, σ²) should not be used to model the
lifetime of a system, unless μ and σ are such that the probability that the random
variable takes on a negative value is negligible. For any values of μ and σ, we can define
the truncated normal distribution as follows:

$$f_X(x) = \frac{c}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\} \quad \text{for } x \ge 0,$$

where c is a constant such that $\int_0^{\infty} f_X(x)\,dx = 1$. That is,

$$c = \left[\int_0^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\} dx\right]^{-1} = \frac{1}{P[Y \ge 0]} = \frac{1}{1 - \Phi(-\mu/\sigma)},$$

where Y ~ N(μ, σ²). We can write that X ≡ Y | {Y ≥ 0}. Note that if μ = 0, then
c = 2.

We find that the failure rate function of a truncated normal distribution is strictly
increasing, which makes it an interesting model for many applications.

Example 5.1.7. The Weibull distribution (see p. 77) is a really important model in
reliability theory and fatigue analysis. We have:

$$R(t) = \int_t^{\infty} \lambda\beta x^{\beta-1} \exp(-\lambda x^{\beta})\,dx = \exp(-\lambda t^{\beta}).$$

It follows that

$$r(t) = \frac{\lambda\beta t^{\beta-1} \exp(-\lambda t^{\beta})}{\exp(-\lambda t^{\beta})} = \lambda\beta t^{\beta-1} \quad \text{for } t > 0.$$

Therefore, the Weibull distribution is DFR if β < 1 and IFR if β > 1. When β = 1, we
retrieve the exponential distribution.
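The three regimes of the Weibull failure rate can be tabulated with a few lines of Python. This is a minimal sketch under the parametrization used in the example, namely R(t) = exp(−λt^β); the function names and the numerical values are assumptions made for illustration.

```python
import math


def weibull_reliability(t, lam, beta):
    """R(t) = exp(-lam * t**beta)."""
    return math.exp(-lam * t ** beta)


def weibull_density(t, lam, beta):
    """f(t) = lam * beta * t**(beta - 1) * exp(-lam * t**beta) for t > 0."""
    return lam * beta * t ** (beta - 1.0) * math.exp(-lam * t ** beta)


def weibull_failure_rate(t, lam, beta):
    """r(t) = f(t) / R(t) = lam * beta * t**(beta - 1) for t > 0."""
    return lam * beta * t ** (beta - 1.0)


if __name__ == "__main__":
    times = [0.5, 1.0, 2.0, 4.0]
    for beta in (0.5, 1.0, 2.0):  # DFR, constant, and IFR cases
        print(beta, [round(weibull_failure_rate(t, 1.0, beta), 4) for t in times])
```

The printed rows should be decreasing, constant, and increasing in t, respectively.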

Although it is true that the failure rate function r(t) should increase as t increases,
for large enough values of t, in many situations it is first a decreasing function of t. For
example, the mortality rate of children does indeed decrease at first. There is a greater
risk that a baby will die at birth or shortly thereafter than when it is six months old, in
particular. When the child grows older, the death rate is more or less constant for some
time, whereas it increases for adults. Therefore, the function r(t) looks like a bathtub
(see Figure 5.1). As mentioned above, the exponential distribution should only be used
for t such that $t_1 \le t \le t_2 < \infty$. It is valid for the flat portion of the bathtub.

Fig. 5.1. Failure rate function having the shape of a bathtub.

Suppose that the lifetime X is defined as follows:

$$X = c_1 X_1 + c_2 X_2 + c_3 X_3,$$

where $X_i$ has a Weibull distribution and $c_i > 0$, for i = 1, 2, 3. We assume that
$c_1 + c_2 + c_3 = 1$. Then, the linear combination of Weibull random variables is called a mixed
Weibull distribution. The bathtub shape can be obtained by choosing the parameter β
of $X_1$ smaller than 1, that of $X_2$ equal to 1, and that of $X_3$ greater than 1. Note that
$X_2$ is actually an exponential random variable.

Next, sometimes we are interested in the probability that a system will fail during
a particular time interval.

Definition 5.1.5. The interval failure rate of a system in the interval $(t_1, t_2]$ is
denoted by $FR(t_1, t_2)$ and is defined by

$$FR(t_1, t_2) = \frac{P[t_1 < T \le t_2 \mid T > t_1]}{t_2 - t_1} = \frac{R(t_1) - R(t_2)}{R(t_1)\,(t_2 - t_1)}$$

for $0 \le t_1 < t_2 < \infty$, where T is a continuous random variable.

Remark. We have:

$$r(t_1) = \lim_{t_2 \downarrow t_1} FR(t_1, t_2).$$


Example 5.1.8. Suppose that T ~ Exp(λ). First, we calculate the conditional density
function $f_T(t \mid T > t_1)$. We find that

$$f_T(t \mid T > t_1) = \frac{f_T(t)}{P[T > t_1]} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t_1}} = \lambda e^{-\lambda(t - t_1)} \quad \text{for } t > t_1$$

[and $f_T(t \mid T > t_1) = 0$ for $t \le t_1$]. It follows that

$$P[t_1 < T \le t_2 \mid T > t_1] = \int_{t_1}^{t_2} \lambda e^{-\lambda(t - t_1)}\,dt = -e^{-\lambda(t - t_1)}\Big|_{t_1}^{t_2} = 1 - e^{-\lambda(t_2 - t_1)}.$$

Hence, we have:

$$FR(t_1, t_2) = \frac{1 - e^{-\lambda(t_2 - t_1)}}{t_2 - t_1}.$$

Actually, we could have obtained this result at once from the reliability function
$R(t) = e^{-\lambda t}$. However, we wanted to give the formula for the density function of the shifted
exponential distribution. A shifted distribution can be used as a model in reliability
theory in the following situation: suppose that a man buys a device for which there is
a guarantee period of length $t_1 > 0$. Then, the buyer is sure that the device in question
will last at least $t_1$ time units (it may be repaired or replaced if it fails before the end
of the guarantee period).

Notice that the function $FR(t_1, t_2)$ actually depends only on the difference $t_2 - t_1$.
Therefore, the probability that the system will fail in a given interval depends only on
the length of this interval, which is a consequence of the memoryless property of the
exponential distribution. Finally, making use of l’Hospital’s rule, we obtain that

$$\lim_{t_2 \downarrow t_1} FR(t_1, t_2) = \lim_{t_2 \downarrow t_1} \frac{1 - e^{-\lambda(t_2 - t_1)}}{t_2 - t_1} = \lim_{\epsilon \downarrow 0} \frac{1 - e^{-\lambda\epsilon}}{\epsilon} = \lim_{\epsilon \downarrow 0} \frac{\lambda e^{-\lambda\epsilon}}{1} = \lambda,$$

as should be.

Example 5.1.9. Let

$$f_T(t) = t e^{-t} \quad \text{for } t \ge 0.$$

The random variable T has a gamma distribution with parameters α = 2 and λ = 1.
We calculate

$$R(t) = \int_t^{\infty} s e^{-s}\,ds = -s e^{-s}\Big|_t^{\infty} + \int_t^{\infty} e^{-s}\,ds = (t + 1)e^{-t} \quad \text{for } t \ge 0,$$

where we used l’Hospital’s rule to evaluate the limit $\lim_{s \to \infty} s e^{-s}$. We have:

$$r(t) = \frac{t e^{-t}}{(t + 1)e^{-t}} = \frac{t}{t + 1} = 1 - \frac{1}{t + 1},$$

which is an increasing function for all values of t. Actually, the gamma distribution is
IFR for any α > 1 (and DFR if 0 < α < 1).

From the function R(t), we deduce that

$$FR(t_1, t_2) = \left[1 - \frac{(t_2 + 1)e^{-t_2}}{(t_1 + 1)e^{-t_1}}\right] \frac{1}{t_2 - t_1}.$$
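The interval failure rates obtained in the last two examples can be reproduced from the reliability functions alone. The sketch below is an illustration only; the helper names and numerical values are assumptions. It uses the expression of FR in terms of R, with R(t) = exp(−λt) for the exponential case and R(t) = (t + 1)exp(−t) for the gamma case with α = 2 and λ = 1.

```python
import math


def interval_failure_rate(reliability, t1, t2):
    """FR(t1, t2) = [R(t1) - R(t2)] / [R(t1) * (t2 - t1)] for 0 <= t1 < t2."""
    r1 = reliability(t1)
    return (r1 - reliability(t2)) / (r1 * (t2 - t1))


def exponential_reliability(lam):
    """Reliability function t -> exp(-lam * t) of an Exp(lam) lifetime."""
    return lambda t: math.exp(-lam * t)


def gamma2_reliability(t):
    """Reliability function (t + 1) * exp(-t) of the gamma distribution with alpha = 2, lambda = 1."""
    return (t + 1.0) * math.exp(-t)


if __name__ == "__main__":
    R = exponential_reliability(2.0)
    # Same interval length, hence same interval failure rate (memoryless property).
    print(interval_failure_rate(R, 1.0, 3.0), interval_failure_rate(R, 5.0, 7.0))
    # Short interval: FR(t1, t2) is close to the failure rate r(t1).
    print(interval_failure_rate(R, 1.0, 1.0 + 1e-6))                   # close to lambda = 2
    print(interval_failure_rate(gamma2_reliability, 1.0, 1.0 + 1e-6))  # close to r(1) = 1/2
```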


Finally, we define another quantity of interest in reliability theory.

Definition 5.1.6. The average failure rate of a system over an interval $[t_1, t_2]$ is
given by

$$AFR(t_1, t_2) = \frac{\int_{t_1}^{t_2} r(t)\,dt}{t_2 - t_1} = \frac{\ln R(t_1) - \ln R(t_2)}{t_2 - t_1}.$$


Remark. This quantity is an example of a temporal average.

Example 5.1.10. If T has a Weibull distribution, we have (see Example 5.1.7):

$$\int_{t_1}^{t_2} r(t)\,dt = \int_{t_1}^{t_2} \lambda\beta t^{\beta-1}\,dt = \lambda(t_2^{\beta} - t_1^{\beta}),$$

so that

$$AFR(t_1, t_2) = \frac{\lambda(t_2^{\beta} - t_1^{\beta})}{t_2 - t_1}.$$

If β = 1, we have that $AFR(t_1, t_2) = \lambda$, whereas β = 2 implies that $AFR(t_1, t_2)$ is equal
to $\lambda(t_1 + t_2)$. Note that, when β = 2, the average failure rate is three times as large in
the interval (t, 2t) as in the interval (0, t).
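The factor of three mentioned at the end of the example can be verified numerically. Since the integral of r(s) over [0, t] equals −ln R(t) (see the derivation preceding Proposition 5.1.3), the average failure rate can be computed from R alone. The sketch below is an illustration under that identity; the function names and the values of λ and t are assumptions.

```python
import math


def average_failure_rate(reliability, t1, t2):
    """AFR(t1, t2) = [ln R(t1) - ln R(t2)] / (t2 - t1), which equals the time average of r(t)."""
    return (math.log(reliability(t1)) - math.log(reliability(t2))) / (t2 - t1)


def weibull_reliability(lam, beta):
    """Reliability function t -> exp(-lam * t**beta) of a Weibull lifetime."""
    return lambda t: math.exp(-lam * t ** beta)


if __name__ == "__main__":
    lam, t = 0.4, 1.5
    R = weibull_reliability(lam, 2.0)
    first = average_failure_rate(R, 0.0, t)       # lam * t
    second = average_failure_rate(R, t, 2.0 * t)  # lam * (t + 2t) = 3 * lam * t
    print(first, second, second / first)
```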
5.2 Reliability of systems


In this section, we consider systems constituted of at least two subsystems or components
that may be connected in series or in parallel. When components are connected in
parallel, we must distinguish between active redundancy and standby (or passive) redundancy.
We assume that the components making up the systems considered cannot be
repaired. When a system fails, it will remain down indefinitely. In fact, it would resume
operating if the failed components were replaced by new ones. However, here we are
only interested in the time elapsed until the first failure of the system.


5.2.1 Systems in series 

Consider n subsystems operating independently of one another. Let $R_k(t)$ be the reliability
function of subsystem k, for k = 1, ..., n. If the subsystems are connected in
series, then each subsystem must be active for the system to operate. Therefore, the
lifetime T of the system is such that

$$T > t \iff T_k > t \text{ for all } k,$$

where $T_k$ is the lifetime of subsystem k. It follows that the reliability function of the
system is given by (see Proposition 5.1.3)

$$R(t) = \prod_{k=1}^{n} R_k(t) = \exp\left\{-\int_0^t [r_1(s) + \cdots + r_n(s)]\,ds\right\}.$$


Remarks. (i) In Chapter 2, we could have asked for the probability that a system made
up of n independent components connected in series operates at a given time instant
$t_0$, given the reliability of each component at $t_0$. Define the events S = “the system
operates at time $t_0$” and $B_k$ = “component k operates at time $t_0$.” We have:

$$P[S] = P\left[\bigcap_{k=1}^{n} B_k\right] \overset{\text{ind.}}{=} \prod_{k=1}^{n} P[B_k].$$

We assumed that the components cannot be repaired, thus this is actually a particular
case of the previous formula. Indeed, we may write that $P[S] = R(t_0)$ and $P[B_k] = R_k(t_0)$.
We must also assume that no components were replaced in the interval $(0, t_0)$.

(ii) When the components making up a system are connected in series, we must assume
that they operate independently of one another, because the system fails as soon as
one of its components fails. For example, suppose that there are only two components,
denoted by A and B. We cannot imagine that there is a certain probability $p_1(t)$ [resp.,
$p_2(t)$] that component B will be active at time t if A is active (resp., down) at time t.
If component A is down at time t, then the system stopped operating when A failed
and remained down henceforth. Furthermore, in continuous time, the probability that
components A and B will fail exactly at the same time instant is equal to zero. Therefore,
assuming that a component cannot fail while the system is down, if A is down, then B
must be active. Unless the lifetime of component B follows an exponential distribution
(or a geometric distribution, in discrete time), when component A is replaced by a new
one, component B should be replaced as well, if we want the system to be as good as
new. Similarly, we cannot suppose that the lifetime of component A has a distribution
that depends on the lifetime of B.

(iii) We can write that

$$T = \min\{T_1, T_2, \ldots, T_n\}.$$

The minimum of a sequence of random variables is a special case of what is known as
order statistics.

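Proposition 5.2.1. Suppose that $T_1$ and $T_2$ are independent exponential random variables
with parameters $\lambda_1$ and $\lambda_2$, respectively. We have:

$$T := \min\{T_1, T_2\} \sim \text{Exp}(\lambda_1 + \lambda_2).$$

Proof. We calculate

$$P[T > t] \overset{\text{ind.}}{=} P[T_1 > t]\,P[T_2 > t] = e^{-\lambda_1 t} e^{-\lambda_2 t} = e^{-(\lambda_1 + \lambda_2)t} \quad \text{for } t \ge 0.$$

It follows that

$$f_T(t) = \frac{d}{dt}\{1 - P[T > t]\} = \frac{d}{dt}\{1 - e^{-(\lambda_1 + \lambda_2)t}\} = (\lambda_1 + \lambda_2)e^{-(\lambda_1 + \lambda_2)t}$$

for t ≥ 0. ■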

Remark. The proposition can be generalized as follows: if $T_k \sim \text{Exp}(\lambda_k)$, for k = 1, ..., n,
and if the $T_k$’s are independent random variables, then
$T := \min\{T_1, T_2, \ldots, T_n\} \sim \text{Exp}(\lambda_1 + \lambda_2 + \cdots + \lambda_n)$.
Therefore, if n independent components having exponentially
distributed lifetimes are placed in series, it is equivalent to having a single component
whose lifetime follows an exponential distribution with parameter λ equal to the sum
of the $\lambda_k$’s. Note that because $r_k(t) = \lambda_k$ for k = 1, ..., n, we have:

$$R(t) = \exp\left\{-\int_0^t (\lambda_1 + \cdots + \lambda_n)\,ds\right\} = \exp[-(\lambda_1 + \cdots + \lambda_n)t] = e^{-\lambda t},$$

where $\lambda := \lambda_1 + \cdots + \lambda_n$.

Example 5.2.1. If $T_k$ is uniformly distributed on the interval [0,1], for k = 1, ..., n,
then

$$R_k(t) = \int_t^1 1\,ds = 1 - t \quad \text{for } 0 \le t \le 1.$$

It follows that

$$R(t) = \prod_{k=1}^{n} R_k(t) = \prod_{k=1}^{n} (1 - t) = (1 - t)^n \quad \text{for } 0 \le t \le 1.$$

Because $T_k \le 1$ for all k, we can write that R(t) = 0 if t > 1.
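The product formula for systems in series is straightforward to evaluate. The following Python sketch is an illustration only (the function names and numerical values are assumptions): it multiplies the component reliabilities and compares the result with the closed forms obtained for exponential and uniform lifetimes.

```python
import math


def series_reliability(component_reliabilities, t):
    """R(t) = product of the R_k(t) for independent subsystems connected in series."""
    result = 1.0
    for r_k in component_reliabilities:
        result *= r_k(t)
    return result


def exponential_reliability(lam):
    """Reliability function t -> exp(-lam * t) of an Exp(lam) lifetime."""
    return lambda t: math.exp(-lam * t)


def uniform01_reliability(t):
    """Reliability function of a lifetime uniformly distributed on [0, 1]."""
    return min(1.0, max(0.0, 1.0 - t))


if __name__ == "__main__":
    rates = [0.5, 1.0, 2.5]
    t = 0.8
    # Exponential components: same as a single Exp(sum of the rates) component.
    print(series_reliability([exponential_reliability(l) for l in rates], t),
          math.exp(-sum(rates) * t))
    # Four uniform components: (1 - t)**4.
    print(series_reliability([uniform01_reliability] * 4, 0.3), (1 - 0.3) ** 4)
```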


5.2.2 Systems in parallel 
Active redundancy 

We now consider systems constituted of at least two subsystems connected in parallel. 
Assume first that all subsystems, which may contain one or many components, operate 
from the initial time t = 0. This is called active redundancy. Then, the whole system 
will operate as long as there is at least one active subsystem remaining. We can write 
that the lifetime T of the system is the maximum of the random variables $T_1, \ldots, T_n$,
where n is the number of subsystems placed in parallel. It follows that

$$T \le t \iff T_k \le t \text{ for } k = 1, \ldots, n.$$

Hence, if the subsystems operate independently of one another, we have:

$$R(t) = 1 - \prod_{k=1}^{n} [1 - R_k(t)].$$


Remark. When the subsystems are connected in parallel, we may consider the case where 
they do not operate independently. This case was already considered in Chapter 2, for 
instance, in Example 2.4.1. 

Example 5.2.2. A device comprises two components connected in parallel and operating
independently of each other. The lifetime $T_k$ of component k has an exponential
distribution with parameter $\lambda_k$, for k = 1, 2. It follows that

$$P[T \le t] = P[\{T_1 \le t\} \cap \{T_2 \le t\}] \overset{\text{ind.}}{=} P[T_1 \le t]\,P[T_2 \le t] = (1 - e^{-\lambda_1 t})(1 - e^{-\lambda_2 t}) \quad \text{for } t \ge 0,$$

where T is the total lifetime of the system. Hence, the reliability function of the system
is

$$R(t) = e^{-\lambda_1 t} + e^{-\lambda_2 t} - e^{-(\lambda_1 + \lambda_2)t}.$$

Note that the maximum of independent exponential random variables does not follow
an exponential distribution (not even if $\lambda_1 = \lambda_2$), because

$$f_T(t) = \frac{d}{dt} P[T \le t] = \lambda_1 e^{-\lambda_1 t} + \lambda_2 e^{-\lambda_2 t} - (\lambda_1 + \lambda_2)e^{-(\lambda_1 + \lambda_2)t} \quad \text{for } t \ge 0.$$

We have:

$$E[T] = \int_0^{\infty} R(t)\,dt = \frac{1}{\lambda_1} + \frac{1}{\lambda_2} - \frac{1}{\lambda_1 + \lambda_2}.$$

In the special case when $\lambda_1 = \lambda_2 = \lambda$, we obtain that E[T] = 1.5/λ. That is, the fact
of installing two identical components in parallel, in this example, increases the mean
time to failure of the device by 50%.
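The mean time to failure found in this example can be cross-checked by integrating the reliability function of the parallel system numerically. The sketch below assumes a trapezoidal rule with a cutoff `upper`; these choices and the function names are illustrative assumptions, not part of the text.

```python
import math


def parallel_reliability(component_reliabilities, t):
    """R(t) = 1 - product of [1 - R_k(t)] for independent subsystems in active redundancy."""
    all_failed = 1.0
    for r_k in component_reliabilities:
        all_failed *= 1.0 - r_k(t)
    return 1.0 - all_failed


def mttf(reliability, upper, steps=100_000):
    """Trapezoidal approximation of E[T] = integral_0^upper R(t) dt (R assumed negligible beyond `upper`)."""
    h = upper / steps
    total = 0.5 * (reliability(0.0) + reliability(upper))
    for i in range(1, steps):
        total += reliability(i * h)
    return total * h


if __name__ == "__main__":
    lam = 0.5
    components = [lambda t: math.exp(-lam * t)] * 2  # two identical Exp(lam) components
    system = lambda t: parallel_reliability(components, t)
    print(mttf(system, upper=80.0), 1.5 / lam)
```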

Example 5.2.3. Suppose, in the preceding example, that the probability that component
no. 2 is active at a fixed time instant $t_0 > 0$ is equal to $e^{-\lambda_{21} t_0}$ if component no. 1
too is active at time $t_0$, and to $e^{-\lambda_{22} t_0}$ if component no. 1 is down at time $t_0$. Then, we
can write (because the exponential distribution is continuous, so that $P[T_1 = t_0] = 0$)
that

$$\begin{aligned}
P[T \le t_0] &= P[\{T_1 \le t_0\} \cap \{T_2 \le t_0\}] = P[T_2 \le t_0 \mid T_1 \le t_0]\,P[T_1 \le t_0] \\
&= (1 - e^{-\lambda_{22} t_0})(1 - e^{-\lambda_1 t_0}).
\end{aligned}$$

Moreover, we have:

$$\begin{aligned}
P[T_2 \le t_0] &= P[T_2 \le t_0 \mid T_1 \le t_0]\,P[T_1 \le t_0] + P[T_2 \le t_0 \mid T_1 > t_0]\,P[T_1 > t_0] \\
&= (1 - e^{-\lambda_{22} t_0})(1 - e^{-\lambda_1 t_0}) + (1 - e^{-\lambda_{21} t_0})\,e^{-\lambda_1 t_0}.
\end{aligned}$$

In general, the constant $\lambda_{22}$ should actually be a function of the exact time at which
component no. 1 failed. In Example 2.4.1, we provided the numerical probabilities that
component B operates at an unspecified time instant, given that component A does or
does not operate at that time instant. Note that the sum $e^{-\lambda_{21} t_0} + e^{-\lambda_{22} t_0}$ can take on
any value in the interval [0, 2].

Remark. By conditioning on the failure time of component no. 1, we can show that

$$\begin{aligned}
P[T_2 > t] &= \int_0^{\infty} P[T_2 > t \mid T_1 = \tau]\,f_{T_1}(\tau)\,d\tau \\
&= \int_0^{t} P[T_2 > t \mid T_1 = \tau]\,f_{T_1}(\tau)\,d\tau + \int_t^{\infty} P[T_2 > t \mid T_1 = \tau]\,f_{T_1}(\tau)\,d\tau.
\end{aligned}$$

Suppose that $T_2$ has an exponential distribution with parameter $\lambda_2$ as long as component
no. 1 operates, and an exponential distribution with parameter $\lambda_3$ (which should be
greater than $\lambda_2$) from the moment when component no. 1 fails. Then, by the memoryless
property of the exponential distribution, we can write that

$$P[T_2 > t \mid T_1 = \tau] = \begin{cases} e^{-\lambda_2 \tau}\,e^{-\lambda_3 (t - \tau)} & \text{if } 0 < \tau \le t, \\ e^{-\lambda_2 t} & \text{if } \tau > t. \end{cases}$$

If the lifetime of component no. 2 is actually independent of that of component no. 1,
so that $\lambda_3 = \lambda_2$, we obtain:

$$P[T_2 > t \mid T_1 = \tau] = \begin{cases} e^{-\lambda_2 \tau}\,e^{-\lambda_2 (t - \tau)} = e^{-\lambda_2 t} & \text{if } 0 < \tau \le t, \\ e^{-\lambda_2 t} & \text{if } \tau > t. \end{cases}$$

That is,

$$P[T_2 > t \mid T_1 = \tau] = e^{-\lambda_2 t} \quad \text{for } t \ge 0 \text{ and any } \tau > 0.$$

We have:

$$P[T_2 > t] = \int_0^{\infty} e^{-\lambda_2 t} f_{T_1}(\tau)\,d\tau = e^{-\lambda_2 t} \int_0^{\infty} f_{T_1}(\tau)\,d\tau = e^{-\lambda_2 t},$$

as should be.

Passive redundancy

Suppose now that a system comprises n subsystems (numbered from 1 to n) connected
in parallel, but that only one subsystem operates at a time. At first, only subsystem
no. 1 is active. When it fails, subsystem no. 2 relieves it, and so forth. This type of
redundancy is called passive (or standby) redundancy.

Remarks. (i) It is understood that there is a device that sends signals to the system
instructing it to activate subsystem no. 2 when the first one fails, and so on. In practice,
this device itself can fail. However, we assume in this book that the signaling device
remains 100% reliable over an indefinite time period. We also assume that the subsystems
placed in standby mode cannot fail before they are activated, although we could
actually have two failure time distributions: a dormant failure distribution and an active
failure distribution.

(ii) Because the subsystems operate one after the other, it is natural to assume (unless
otherwise stated) that their lifetimes are independent random variables.

The total lifetime T of the system is obviously given by

$$T = T_1 + T_2 + \cdots + T_n,$$

so that its mean time to failure is

$$E[T] = \sum_{k=1}^{n} E[T_k].$$

Moreover, because the subsystems operate independently of one another, we have:

$$VAR[T] \overset{\text{ind.}}{=} \sum_{k=1}^{n} VAR[T_k].$$

In general, it is not easy to find an explicit expression for the reliability function of
the system, because the density function of T is the convolution of the density functions
of the random variables $T_1, \ldots, T_n$. In the particular case when $T_k \sim \text{Exp}(\lambda)$, for all k,
we know (see Subsection 4.3.3) that T has a gamma distribution with parameters n and
λ. Furthermore, making use of Formula (3.3.2), we can write that

$$R(t) := P[T > t] = P[\text{Poi}(\lambda t) \le n - 1] = \sum_{k=0}^{n-1} e^{-\lambda t} \frac{(\lambda t)^k}{k!} \quad \text{for } t \ge 0.$$
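The Poisson sum giving the reliability of a standby system with exponential subsystems is easy to evaluate. The sketch below is a minimal illustration, with a hypothetical function name and arbitrary numerical values.

```python
import math


def standby_reliability_exponential(n, lam, t):
    """R(t) = P[Poisson(lam * t) <= n - 1] for n i.i.d. Exp(lam) subsystems in standby redundancy."""
    x = lam * t
    return sum(math.exp(-x) * x ** k / math.factorial(k) for k in range(n))


if __name__ == "__main__":
    lam, t = 1.2, 2.0
    for n in (1, 2, 3):
        print(n, standby_reliability_exponential(n, lam, t))
```

With n = 1 the function reduces to exp(−λt), and each additional standby subsystem adds one more term of the Poisson sum.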


Example 5.2.4. A system is made up of two identical (and independent) components
arranged in standby redundancy. If the lifetime $T_k$ of each component follows a uniform
distribution on the interval (0,1), then (see Example 4.3.4) we can write that

$$F_T(t) = \begin{cases} \displaystyle\int_0^t s\,ds = \frac{t^2}{2} & \text{if } 0 < t < 1, \\[8pt] \displaystyle\frac{1}{2} + \int_1^t (2 - s)\,ds = 2t - \frac{t^2}{2} - 1 & \text{if } 1 \le t < 2. \end{cases}$$

It follows that

$$R(t) = 1 - F_T(t) = \begin{cases} 1 - \dfrac{t^2}{2} & \text{if } 0 < t < 1, \\[8pt] 2 + \dfrac{t^2}{2} - 2t & \text{if } 1 \le t < 2 \end{cases}$$

[and R(t) = 0 if t ≥ 2].

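Example 5.2.5. Suppose that, in the previous example, $T_1 \sim \text{Exp}(\lambda_1)$ and $T_2 \sim \text{Exp}(\lambda_2)$,
where $\lambda_1 \ne \lambda_2$. Then, using (4.8), we can write that

$$\begin{aligned}
f_T(t) &= \int_{-\infty}^{\infty} f_{T_1}(u)\,f_{T_2}(t - u)\,du \\
&= \int_0^t \lambda_1 e^{-\lambda_1 u}\,\lambda_2 e^{-\lambda_2 (t - u)}\,du = e^{-\lambda_2 t} \int_0^t \lambda_1\lambda_2 e^{-(\lambda_1 - \lambda_2)u}\,du \\
&= \frac{\lambda_1\lambda_2}{\lambda_1 - \lambda_2}\left(e^{-\lambda_2 t} - e^{-\lambda_1 t}\right) \quad \text{for } t \ge 0.
\end{aligned}$$

It follows that

$$R(t) = \int_t^{\infty} f_T(s)\,ds = \frac{\lambda_1\lambda_2}{\lambda_1 - \lambda_2}\left(\frac{e^{-\lambda_2 t}}{\lambda_2} - \frac{e^{-\lambda_1 t}}{\lambda_1}\right) = \frac{\lambda_1 e^{-\lambda_2 t} - \lambda_2 e^{-\lambda_1 t}}{\lambda_1 - \lambda_2} \quad \text{for } t \ge 0.$$

Note that, making use of l’Hospital’s rule, we obtain:

$$\lim_{\lambda_2 \to \lambda_1} f_T(t) = \lambda_1^2 \lim_{\lambda_2 \to \lambda_1} \frac{e^{-\lambda_2 t} - e^{-\lambda_1 t}}{\lambda_1 - \lambda_2} = \lambda_1^2 \lim_{\lambda_2 \to \lambda_1} \frac{-t e^{-\lambda_2 t}}{0 - 1} = \lambda_1^2\,t e^{-\lambda_1 t} \quad \text{for } t \ge 0.$$

That is, T ~ G(α = 2, $\lambda_1$), as should be. We also have:

$$\lim_{\lambda_2 \to \lambda_1} R(t) \overset{\text{L'Hos.}}{=} \lim_{\lambda_2 \to \lambda_1} \frac{-\lambda_1 t e^{-\lambda_2 t} - e^{-\lambda_1 t}}{0 - 1} = e^{-\lambda_1 t}(\lambda_1 t + 1) \quad \text{for } t \ge 0.$$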
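The limiting behavior obtained with l’Hospital’s rule can be observed numerically by letting the second rate approach the first. The sketch below is only an illustration; the function names and the sequence of gaps are assumptions.

```python
import math


def standby_two_exponential(lam1, lam2, t):
    """R(t) = (lam1 * exp(-lam2 * t) - lam2 * exp(-lam1 * t)) / (lam1 - lam2), for lam1 != lam2."""
    return (lam1 * math.exp(-lam2 * t) - lam2 * math.exp(-lam1 * t)) / (lam1 - lam2)


def gamma2_reliability(lam, t):
    """Limiting case lam2 -> lam1 = lam: R(t) = exp(-lam * t) * (lam * t + 1)."""
    return math.exp(-lam * t) * (lam * t + 1.0)


if __name__ == "__main__":
    lam1, t = 1.0, 2.0
    for gap in (1e-1, 1e-3, 1e-5):
        print(gap, standby_two_exponential(lam1, lam1 + gap, t), gamma2_reliability(lam1, t))
```

As the gap shrinks, the first value printed on each line should approach the second one. Very small gaps are best avoided in floating-point arithmetic, because the numerator and denominator both tend to zero.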


5.2.3 Other cases 

Suppose that a system is made up of n subsystems and that at least k working subsystems
are needed for the system to operate, where 0 < k ≤ n. This is called a
k-out-of-n system. Note that a series system is the particular case when k = n, whereas
a parallel system (with active redundancy) corresponds to the case when k = 1.

In general, we cannot give a simple formula for the reliability function R(t) of the
system. However, if all the subsystems are independent and have the same reliability
function $R_1(t)$, then the function R(t) is given by

$$R(t) = P[N \ge k], \quad \text{where } N \sim B(n, p = R_1(t)).$$

That is,

$$R(t) = \sum_{i=k}^{n} \binom{n}{i} [R_1(t)]^i [1 - R_1(t)]^{n-i} = 1 - \sum_{i=0}^{k-1} \binom{n}{i} [R_1(t)]^i [1 - R_1(t)]^{n-i}.$$
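The binomial formula for a k-out-of-n system with identical, independent subsystems translates directly into code. In the sketch below, `p` stands for the common value $R_1(t)$ at a fixed time t; the function name and the numerical values are assumptions chosen for illustration.

```python
from math import comb


def k_out_of_n_reliability(k, n, p):
    """P[N >= k] for N ~ B(n, p), where p = R_1(t) is the common subsystem reliability at time t."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    p = 0.9
    print(k_out_of_n_reliability(3, 3, p), p ** 3)            # series system
    print(k_out_of_n_reliability(1, 3, p), 1 - (1 - p) ** 3)  # parallel system (active redundancy)
    print(k_out_of_n_reliability(2, 4, p))                    # e.g., two working engines out of four
```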

Remark. We assume that the subsystems operate independently of one another. In 
practice, the lifetimes of the working components often depend on the total number of 
active components. For example, suppose that an airplane has four engines, but that it 
can fly and land with only two of them. If two engines fail while the airplane is flying, 
more load will be put on the two remaining engines, so that their lifetimes are likely to 
be shorter. 

Example 5.2.6. Consider a 2-out-of-3 system for which the lifetime of subsystem i
follows an exponential distribution with parameter $\lambda_i = i$, for i = 1, 2, 3. To obtain the
reliability function of the system, we use the following formula: if $A_1$, $A_2$, and $A_3$ are
independent events, then

$$\begin{aligned}
P[(A_1 \cap A_2) \cup (A_1 \cap A_3) \cup (A_2 \cap A_3)]
&= P[A_1 \cap A_2] + P[A_1 \cap A_3] + P[A_2 \cap A_3] - 3P[A_1 \cap A_2 \cap A_3] \\
&\quad + P[A_1 \cap A_2 \cap A_3] \\
&\overset{\text{ind.}}{=} P[A_1]P[A_2] + P[A_1]P[A_3] + P[A_2]P[A_3] - 2P[A_1]P[A_2]P[A_3].
\end{aligned}$$

Let $A_i = \{T_i > t\}$, so that

$$P[A_i] = P[T_i > t] = e^{-it} \quad \text{for } i = 1, 2, 3.$$

Then, we may write that

$$R(t) = e^{-3t} + e^{-4t} + e^{-5t} - 2e^{-6t} \quad \text{for } t \ge 0.$$
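When the subsystems are independent but not identical, the reliability at a fixed time can also be obtained by brute force, by summing the probabilities of all working configurations. This is not the method used in the example; it is a sketch given as a cross-check, and the function names and the value of t are assumptions.

```python
import math
from itertools import product


def reliability_by_enumeration(works, probs):
    """Sum the probabilities of all component states for which `works(state)` is True.

    `probs[i]` is the probability that component i is active; components are independent.
    """
    total = 0.0
    for state in product((False, True), repeat=len(probs)):
        if works(state):
            weight = 1.0
            for active, p in zip(state, probs):
                weight *= p if active else 1.0 - p
            total += weight
    return total


def two_out_of_three(state):
    """The system operates when at least two of the three subsystems are active."""
    return sum(state) >= 2


if __name__ == "__main__":
    t = 0.4
    probs = [math.exp(-i * t) for i in (1, 2, 3)]  # P[A_i] = exp(-i t)
    closed_form = (math.exp(-3 * t) + math.exp(-4 * t) + math.exp(-5 * t)
                   - 2 * math.exp(-6 * t))
    print(reliability_by_enumeration(two_out_of_three, probs), closed_form)
```

The enumeration visits 2^n states, so it is only practical for a small number of components.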


Remark. We can also write that

$$\begin{aligned} R(t) &= P[A_1 \cap A_2 \cap A_3'] + P[A_1 \cap A_2' \cap A_3] + P[A_1' \cap A_2 \cap A_3] + P[A_1 \cap A_2 \cap A_3] \\ &\stackrel{\text{ind.}}{=} P[A_1]P[A_2](1 - P[A_3]) + P[A_1](1 - P[A_2])P[A_3] \\ &\quad + (1 - P[A_1])P[A_2]P[A_3] + P[A_1]P[A_2]P[A_3] \\ &= P[A_1]P[A_2] + P[A_1]P[A_3] + P[A_2]P[A_3] - 2P[A_1]P[A_2]P[A_3], \end{aligned}$$

as above. That is, we decompose the event $\{T > t\}$ into four incompatible cases: exactly
two subsystems operate at time $t$, or the three subsystems operate.
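The decomposition into incompatible cases can be checked numerically. The sketch below is an illustration under the assumptions of Example 5.2.6 (independent subsystems, exponential lifetimes with parameters 1, 2, and 3); the function names are ours. It sums the probabilities of all states in which at least two subsystems are active and compares the result with the closed-form expression obtained in the example.

```python
from itertools import product
from math import exp


def reliability_2_of_3(t, rates=(1.0, 2.0, 3.0)):
    """Sum P[state] over all states with at least two active subsystems."""
    p = [exp(-lam * t) for lam in rates]  # P[T_i > t] for each subsystem
    total = 0.0
    for state in product((0, 1), repeat=3):
        if sum(state) >= 2:
            prob = 1.0
            for x, p_i in zip(state, p):
                prob *= p_i if x else 1 - p_i
            total += prob
    return total


def closed_form(t):
    """Closed-form reliability function found in Example 5.2.6."""
    return exp(-3 * t) + exp(-4 * t) + exp(-5 * t) - 2 * exp(-6 * t)
```

Both functions should agree, up to rounding errors, for any $t \ge 0$.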


Next, the system shown in Figure 5.2 is called a bridge system. It operates at time 
t if and only if at least one of the following events occurs: 


• $A_1$: “components nos. 1 and 4 are active at time $t$;”

• $A_2$: “components nos. 2 and 5 are active at time $t$;”

• $A_3$: “components nos. 1, 3, and 5 are active at time $t$;”

• $A_4$: “components nos. 2, 3, and 4 are active at time $t$.”


Because the events $A_1, \ldots, A_4$ are neither independent nor incompatible, we need the
formula for the probability of the union of four arbitrary events:

$$\begin{aligned} P[A_1 \cup A_2 \cup A_3 \cup A_4] = {} & P[A_1] + P[A_2] + P[A_3] + P[A_4] - P[A_1 \cap A_2] \\ & - P[A_1 \cap A_3] - P[A_1 \cap A_4] - P[A_2 \cap A_3] - P[A_2 \cap A_4] - P[A_3 \cap A_4] \\ & + P[A_1 \cap A_2 \cap A_3] + P[A_1 \cap A_2 \cap A_4] + P[A_1 \cap A_3 \cap A_4] \\ & + P[A_2 \cap A_3 \cap A_4] - P[A_1 \cap A_2 \cap A_3 \cap A_4]. \end{aligned} \tag{5.1}$$


In the special case when the five components in the bridge system operate independently
and all have the same reliability function $R_1(t)$, we can easily calculate the reliability
function of the system.

Fig. 5.2. A bridge system.
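As a sketch of that calculation, write $R_1$ as shorthand for the common reliability $R_1(t)$ (a shorthand used only here). Under independence, each term in (5.1) is equal to $R_1^m$, where $m$ is the number of distinct components involved in the corresponding intersection. Every intersection of three or four of the events involves all five components, and so does $A_3 \cap A_4$:

```latex
\begin{align*}
P[A_1] = P[A_2] &= R_1^2, \qquad P[A_3] = P[A_4] = R_1^3, \\
P[A_1 \cap A_2] = P[A_1 \cap A_3] = P[A_1 \cap A_4]
  = P[A_2 \cap A_3] = P[A_2 \cap A_4] &= R_1^4, \qquad P[A_3 \cap A_4] = R_1^5, \\
R(t) &= 2R_1^2 + 2R_1^3 - \left(5R_1^4 + R_1^5\right) + 4R_1^5 - R_1^5 \\
     &= 2R_1^2 + 2R_1^3 - 5R_1^4 + 2R_1^5 .
\end{align*}
```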


Finally, as we did in Chapter 2, we can consider systems made up of a number of 
subsystems connected in series and others connected in parallel. 

Example 5.2.7. A system is constituted of two subsystems placed in series. The first 
subsystem comprises two components connected in parallel, and the second subsystem 
contains a single component. Suppose that the three components operate independently. 
Let Ri(t) be the reliability function of component i, for i = 1, 2, 3. Then, the reliability 
function of the system is given by 

$$R(t) = \{1 - [1 - R_1(t)][1 - R_2(t)]\} R_3(t) = [R_1(t) + R_2(t) - R_1(t)R_2(t)] R_3(t).$$

That is, we make use of the formulas for both series and parallel systems at the same 
time. 


5.3 Paths and cuts 

When a system consists of a large (enough) number of components, a first task in 
reliability theory is to find the various sets of active components that will enable the 
system to operate. These sets are called paths. Conversely, we can try to determine 
the sets of components, called cuts , which, when the components they comprise all are 
down, entail the failure of the whole system. 

Let $X_i$ be the Bernoulli random variable that represents the state of component $i$
at a fixed time instant $t_0 > 0$. More precisely, $X_i = 1$ (resp., $X_i = 0$) if component $i$
is active (resp., down) at time $t_0$, for $i = 1, \ldots, n$. The random variable $X_i$ is actually
the indicator variable of the event “component $i$ is active at time $t_0$.” To calculate
the reliability of the system at time $t_0$, we need the value of the probability $p_i :=
P[X_i = 1]$, for $i = 1, \ldots, n$. However, to determine the paths and cuts of a system, we only
have to consider the particular values $x_i$ taken by the corresponding random variables.
Furthermore, we assume that the fact that the system operates or not at time $t_0$ depends
only on the state of its components at that time instant.
Definition 5.3.1. The function

$$H(x_1, \ldots, x_n) = \begin{cases} 1 & \text{if the system operates}, \\ 0 & \text{if the system is down} \end{cases}$$

is called the structure function of the system. The vector $\mathbf{x} := (x_1, \ldots, x_n)$ is the
state vector of the system.


Remarks. (i) The function $H(\mathbf{X}) = H(X_1, \ldots, X_n)$ is itself a Bernoulli random variable.

(ii) Let $\mathbf{0} = (0, \ldots, 0)$ and $\mathbf{1} = (1, \ldots, 1)$. We assume that $H(\mathbf{0}) = 0$ and $H(\mathbf{1}) = 1$.
That is, if all the components of the system have failed, then the system is down and,
conversely, if all the components are active, then the system is operating.

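Notation. Consider two n-dimensional vectors: $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{y} = (y_1, \ldots, y_n)$.
We write:

$$\mathbf{x} \ge \mathbf{y} \quad \text{if } x_i \ge y_i \text{ for } i = 1, \ldots, n$$

and

$$\mathbf{x} > \mathbf{y} \quad \text{if } x_i \ge y_i \text{ for } i = 1, \ldots, n \text{ and } x_i > y_i \text{ for at least one } i.$$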


Definition 5.3.2. A structure function $H(\mathbf{x})$ such that $H(\mathbf{0}) = 0$, $H(\mathbf{1}) = 1$, and

$$H(\mathbf{x}) \ge H(\mathbf{y}) \quad \text{if } \mathbf{x} \ge \mathbf{y} \tag{5.2}$$

is said to be monotonic.

In this book, we assume that the structure functions $H(\mathbf{x})$ of the systems considered
are monotonic.

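Example 5.3.1. In the case of a series system (made up of $n$ components), we have:

$$H(\mathbf{x}) = 1 \iff x_i = 1 \ \forall i \iff \sum_{i=1}^{n} x_i = n,$$

whereas if the $n$ components are connected in parallel, then

$$H(\mathbf{x}) = 1 \iff \sum_{i=1}^{n} x_i \ge 1.$$

In general, for a k-out-of-n system, we can write that

$$H(x_1, \ldots, x_n) = \begin{cases} 1 & \text{if } \sum_{i=1}^{n} x_i \ge k, \\ 0 & \text{if } \sum_{i=1}^{n} x_i < k. \end{cases}$$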
Remarks. (i) We can also express the structure function $H(\mathbf{x})$ as follows:

$$H(x_1, \ldots, x_n) = \begin{cases} \min\{x_1, \ldots, x_n\} & \text{for a series system}, \\ \max\{x_1, \ldots, x_n\} & \text{for a parallel system}. \end{cases}$$

(ii) In the present section, when we write that the components are connected in
parallel, we assume that they all operate from the initial time. That is, they are in active
redundancy.

To calculate the value of the structure function of an arbitrary system, the following
formulas are useful:

$$\min\{x_1, \ldots, x_n\} = \prod_{i=1}^{n} x_i$$

and

$$\max\{x_1, \ldots, x_n\} = 1 - \prod_{i=1}^{n} (1 - x_i),$$

which are valid when $x_i = 0$ or 1, for $i = 1, \ldots, n$.
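These two identities can be verified by exhaustive enumeration of binary state vectors. The following Python sketch does so for a small system; the function names `series_structure` and `parallel_structure` are ours and are introduced only for illustration.

```python
from itertools import product
from math import prod


def series_structure(x):
    """Structure function of a series system: the product of the x_i."""
    return prod(x)


def parallel_structure(x):
    """Structure function of a parallel system: 1 - product of (1 - x_i)."""
    return 1 - prod(1 - x_i for x_i in x)


# Compare with min and max over every binary state vector of length 4.
identities_hold = all(
    series_structure(x) == min(x) and parallel_structure(x) == max(x)
    for x in product((0, 1), repeat=4)
)
```

The check relies on the fact that every $x_i$ is equal to 0 or 1; the product formulas are not valid for other values.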

Definition 5.3.3. A path vector is any vector $\mathbf{x}$ for which $H(\mathbf{x}) = 1$. If, besides,
$H(\mathbf{y}) = 0$ for all vectors $\mathbf{y}$ such that $\mathbf{y} < \mathbf{x}$, then $\mathbf{x}$ is called a minimal path vector.
Moreover, with every minimal path vector $\mathbf{x} = (x_1, \ldots, x_n)$ we associate a set $MP :=
\{k \in \{1, \ldots, n\}: x_k = 1\}$ called a minimal path set.

Definition 5.3.4. If $H(\mathbf{x}) = 0$, the state vector $\mathbf{x}$ is said to be a cut vector. If, in
addition, $H(\mathbf{y}) = 1$ when $\mathbf{y} > \mathbf{x}$, then $\mathbf{x}$ is a minimal cut vector. Furthermore, the
set $MC := \{k \in \{1, \ldots, n\}: x_k = 0\}$, where $\mathbf{x} = (x_1, \ldots, x_n)$ is a minimal cut vector, is
called a minimal cut set.

Remarks. (i) In some books, the definition of a path (resp., cut) (set) corresponds to
that of a minimal path set (resp., minimal cut set) here.

(ii) A minimal path set is a group of components such that when they are all active the 
system operates, but if at least one of this group of components fails, then the system 
too fails. Conversely, if all the components in a minimal cut set are down, the system 
is down as well, but if at least one component of the minimal cut set is replaced by an 
active component, then the system will operate. 

Example 5.3.2. A series system made up of n components has a single minimal path 
set, namely the set MP = {1,2,... ,n} (because all components must operate for the 
system to function). It has n minimal cut sets, which are all the sets containing exactly 
one component: {1}, ..., {n}. Note that when we write that MC = {1}, it implies 
that components 2,..., n are active. Moreover, the state vector (0,0,1,1,..., 1) is a cut 
vector, but not a minimal cut vector, because if we replace only component no. 1 by an 
active component, the system will remain down. 
Conversely, in the case of a parallel system comprising n components, the minimal
path sets are {1}, ..., {n}, whereas there is only one minimal cut set: {1, 2,..., n}. 

Example 5.3.3. We can generalize the results of the previous example as follows: in a
k-out-of-n system, there are $\binom{n}{k}$ minimal path sets. That is, we can choose any set of $k$
components among the $n$. The number of minimal cut sets is given by $\binom{n}{n-k+1}$. Indeed,
if exactly $n - k + 1$ components are down, then the system will resume operating if one
of them is replaced by an active component.


Example 5.3.4. The bridge system in Figure 5.2 has four minimal path sets, as
indirectly mentioned above: {1, 4}, {2, 5}, {1, 3, 5}, and {2, 3, 4}. It also has four minimal
cut sets: {1, 2}, {4, 5}, {1, 3, 5}, and {2, 3, 4}. Notice that {1, 3, 5} and {2, 3, 4} are both
minimal path and minimal cut sets.
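For a small system, the minimal path sets and minimal cut sets can be found by brute force from the structure function. The sketch below is an illustration, not a method given in the text: the function names are ours, and `bridge` encodes the four events $A_1, \ldots, A_4$ listed earlier for the bridge system. Because the structure function is assumed to be monotonic, it is enough to test the effect of changing one component at a time.

```python
from itertools import product


def bridge(x):
    """Structure function of the bridge system (components numbered 1 to 5)."""
    x1, x2, x3, x4, x5 = x
    return int((x1 and x4) or (x2 and x5) or (x1 and x3 and x5) or (x2 and x3 and x4))


def minimal_path_sets(H, n):
    """Path vectors for which turning off any one active component gives H = 0."""
    sets = []
    for x in product((0, 1), repeat=n):
        if H(x) == 1 and all(
            H(x[:i] + (0,) + x[i + 1:]) == 0 for i in range(n) if x[i] == 1
        ):
            sets.append({i + 1 for i in range(n) if x[i] == 1})
    return sets


def minimal_cut_sets(H, n):
    """Cut vectors for which turning on any one failed component gives H = 1."""
    sets = []
    for x in product((0, 1), repeat=n):
        if H(x) == 0 and all(
            H(x[:i] + (1,) + x[i + 1:]) == 1 for i in range(n) if x[i] == 0
        ):
            sets.append({i + 1 for i in range(n) if x[i] == 0})
    return sets
```

Calling `minimal_path_sets(bridge, 5)` and `minimal_cut_sets(bridge, 5)` should return the four minimal path sets and the four minimal cut sets of Example 5.3.4, in some order. The enumeration visits $2^n$ state vectors, so it is only practical for small $n$.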

Now, suppose that an arbitrary system has $r$ minimal path sets. Let

$$\pi_j(x_1, \ldots, x_n) = \prod_{i \in MP_j} x_i \quad \text{for } j = 1, \ldots, r.$$

That is, $\pi_j(x_1, \ldots, x_n) = 1$ if all the components in the minimal path set $MP_j$ function,
and $\pi_j(x_1, \ldots, x_n) = 0$ otherwise. Because a system operates if and only if all the
components in at least one of its minimal path sets are active, we can represent the
structure function of the system in question as follows:

$$H(\mathbf{x}) = 1 - \prod_{j=1}^{r} [1 - \pi_j(\mathbf{x})].$$

This formula implies that a given system can be considered as being equivalent to the
one obtained by connecting its minimal path sets in parallel.

Likewise, if an arbitrary system has $s$ minimal cut sets, we can write that

$$H(x_1, \ldots, x_n) = \prod_{m=1}^{s} \gamma_m(x_1, \ldots, x_n),$$

where

$$\gamma_m(x_1, \ldots, x_n) := 1 - \prod_{i \in MC_m} (1 - x_i) \quad \text{for } m = 1, \ldots, s.$$

We have that $\gamma_m(x_1, \ldots, x_n)$ is equal to 0 if all the components in the minimal cut set
$MC_m$ are down, and to 1 otherwise. This time, we can say that a given system and the
one made up of its minimal cut sets connected in series are equivalent.
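The two representations can be compared numerically. The Python sketch below builds the structure function of the bridge system once from its minimal path sets and once from its minimal cut sets (both taken from Example 5.3.4). The names `H_from_paths` and `H_from_cuts` are ours and serve only as an illustration.

```python
from itertools import product
from math import prod

MIN_PATHS = [{1, 4}, {2, 5}, {1, 3, 5}, {2, 3, 4}]
MIN_CUTS = [{1, 2}, {4, 5}, {1, 3, 5}, {2, 3, 4}]


def H_from_paths(x, paths=MIN_PATHS):
    """H(x) = 1 - prod_j [1 - pi_j(x)], where pi_j(x) = prod of x_i over MP_j."""
    return 1 - prod(1 - prod(x[i - 1] for i in mp) for mp in paths)


def H_from_cuts(x, cuts=MIN_CUTS):
    """H(x) = prod_m gamma_m(x), where gamma_m(x) = 1 - prod of (1 - x_i) over MC_m."""
    return prod(1 - prod(1 - x[i - 1] for i in mc) for mc in cuts)


# The two representations should coincide on all 2^5 state vectors.
representations_agree = all(
    H_from_paths(x) == H_from_cuts(x) for x in product((0, 1), repeat=5)
)
```

Components are numbered from 1 in the sets, whereas Python tuples are indexed from 0, hence the `x[i - 1]`.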


Example 5.3.5. From the preceding example, we deduce that the bridge system in
Figure 5.2 is equivalent to either the system depicted in Figure 5.3 or that in Figure 5.4.

Fig. 5.3. A bridge system represented as a parallel system made up of its minimal path sets.

Fig. 5.4. A bridge system represented as a series system made up of its minimal cut sets.

Because $H(\mathbf{X})$, where $\mathbf{X} := (X_1, \ldots, X_n)$, is a Bernoulli random variable, the
reliability of the system, at the fixed time instant $t_0$, is given by

$$R(t_0) = P[H(\mathbf{X}) = 1] = E[H(\mathbf{X})].$$


If we let

$$p_i = P[X_i = 1] \ (\text{at time } t_0) \quad \text{for } i = 1, \ldots, n$$

and if we assume that the components operate independently of one another, then we
can write that

$$R(t_0) = \begin{cases} \prod_{i=1}^{n} p_i & \text{for a series system}, \\ 1 - \prod_{i=1}^{n} (1 - p_i) & \text{for a parallel system}. \end{cases}$$

Moreover, if $p_i = p$ for all $i$, then

$$R(t_0) = \sum_{i=k}^{n} \binom{n}{i} p^i (1 - p)^{n-i}$$

in the case of a k-out-of-n system. These formulas are simply particular cases of the
corresponding ones in Section 5.2.

When a system comprises many components, we can at least try to obtain bounds
for its reliability $R(t_0)$ at time $t_0$. It can be shown that

$$\prod_{m=1}^{s} P[\gamma_m(\mathbf{X}) = 1] \le R(t_0) \le 1 - \prod_{j=1}^{r} \{1 - P[\pi_j(\mathbf{X}) = 1]\},$$

where

$$P[\gamma_m(\mathbf{X}) = 1] = 1 - \prod_{i \in MC_m} (1 - p_i)$$

and

$$P[\pi_j(\mathbf{X}) = 1] = \prod_{i \in MP_j} p_i.$$

Example 5.3.6. Suppose that $p_i = 0.9$ for the bridge system in Figure 5.2. Making use
of Equation (5.1), we find (assuming that the components are independent) that

$$R(t_0) = 2(0.9)^2 + 2(0.9)^3 - 5(0.9)^4 + 2(0.9)^5 = 0.97848.$$

Indeed, we have that $P[A_i] = (0.9)^2$, for $i = 1, 2$, and $P[A_i] = (0.9)^3$, for $i = 3, 4$.
Moreover, the probability of any intersection in (5.1) is equal to $(0.9)^k$, where $k$ is the
number of distinct components involved in the intersection in question. For example,
$A_1 \cap A_2$ occurs if and only if components 1, 2, 4, and 5 are active, so that $P[A_1 \cap A_2] =
(0.9)^4$.

Now, we deduce from Example 5.3.4 that the lower bound for the probability $R(t_0)$
is given by

$$[1 - (0.1)^2]^2 [1 - (0.1)^3]^2 \simeq 0.97814$$

and the upper bound is

$$1 - \{[1 - (0.9)^2]^2 [1 - (0.9)^3]^2\} \simeq 0.99735.$$

Notice that in this example the lower bound, in particular, is very precise.
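The three numbers in this example can be reproduced with a few lines of Python. The sketch below is only an illustration under the assumptions of the example (independent components, each active with probability 0.9); the function names are ours, and the exact reliability is obtained by summing over all $2^5$ state vectors rather than through Equation (5.1).

```python
from itertools import product
from math import prod

MIN_PATHS = [{1, 4}, {2, 5}, {1, 3, 5}, {2, 3, 4}]
MIN_CUTS = [{1, 2}, {4, 5}, {1, 3, 5}, {2, 3, 4}]
p = {i: 0.9 for i in range(1, 6)}  # p[i] = P[X_i = 1] at time t_0


def exact_reliability(p):
    """Sum the probabilities of all state vectors for which the system operates."""
    total = 0.0
    for x in product((0, 1), repeat=5):
        if any(all(x[i - 1] for i in mp) for mp in MIN_PATHS):
            total += prod(p[i] if x[i - 1] else 1 - p[i] for i in range(1, 6))
    return total


def lower_bound(p):
    """Product over the minimal cut sets of P[gamma_m(X) = 1]."""
    return prod(1 - prod(1 - p[i] for i in mc) for mc in MIN_CUTS)


def upper_bound(p):
    """1 - product over the minimal path sets of {1 - P[pi_j(X) = 1]}."""
    return 1 - prod(1 - prod(p[i] for i in mp) for mp in MIN_PATHS)
```

The values returned by `lower_bound(p)`, `exact_reliability(p)`, and `upper_bound(p)` should match the figures 0.97814, 0.97848, and 0.99735 of the example to five decimal places.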


5.4 Exercises for Chapter 5 


Solved exercises 


Question no. 1 

Suppose that the lifetime $X$ of a certain system can be expressed as $X = Y^2$, where
$Y$ is a random variable uniformly distributed on the interval (0, 1). Find the reliability
function of the system.

Question no. 2 

The time to failure (in years) for a given device is a random variable $X$ having an
exponential distribution with parameter $\lambda = 1/2$. When the device fails, it is repaired.
The repair time $Y$ (in days) is a random variable such that

$$P[Y > y] = \begin{cases} e^{-y} & \text{if } X < 2 \text{ and } y \ge 0, \\ e^{-y/2} & \text{if } X \ge 2 \text{ and } y \ge 0. \end{cases}$$


Calculate the quantity MTBF for this device. 

Question no. 3 

Let T be a continuous random variable having the following probability density 
function: 



[and $f_T(t) = 0$ if $t < 0$], where $\lambda$ is a positive parameter. We say that $T$ has a (particular)
extreme value distribution. Calculate the failure rate function of a device whose lifetime
is distributed as $T$.

Question no. 4 

Suppose that the lifetime (in cycles) of a system is a Poisson random variable with
parameter $\lambda > 0$. Is the failure rate function $r(k)$ increasing or decreasing at $k = 0$?

Question no. 5 

If a system has a lifetime X that is uniformly distributed on the interval [0,1], what 
is its average failure rate over the interval [0,1/2]? 

Question no. 6 

A system comprises two (independent) components connected in series. The lifetime
$X_k$ of component $k$ has an exponential distribution with parameter $\lambda_k$, for $k = 1, 2$. Use
the formula [see (4.5)]



which is valid for arbitrary continuous random variables $X_1$ and $X_2$, to calculate the
probability that the first breakdown of the system will be caused by a failure of
component no. 2.

Question no. 7 

We consider a system made up of two components connected in parallel and
operating independently of each other. Let $T$ be the total lifetime of the system, and let $X_k$
be the lifetime of component $k$, for $k = 1, 2$. Suppose that $X_k \sim \text{Exp}(\lambda_k)$. What is the
probability that both components are still active at time $t_0 > 0$, given that the system
operates at that time instant?

Question no. 8 

We have two identical brand A components and two identical brand B components 
at our disposal. To build a certain device, we must connect a brand A and a brand B 
component in series. Suppose that the reliability of each component is equal to 0.9 (at 
the initial time) and that they all operate independently of one another. Is it better 
to build two distinct devices and hope that at least one of them will work, or to build 
a device made up of two subsystems connected in series, with the first (resp., second) 
subsystem comprising the two brand A (resp., B) components placed in parallel? 
Question no. 9

Calculate the structure function of the system represented in Figure 2.14, p. 46, in
terms of the indicator variables $X_k$, for $k = 1, \ldots, 4$.

Question no. 10 

Find the minimal path sets of the system represented in Figure 2.14, p. 46, and
express the structure function $H(x_1, x_2, x_3, x_4)$ in terms of the functions $\pi_j(x_1, x_2, x_3, x_4)$.


Exercises 


Question no. 1 

We want to build a system made up of two components connected in parallel, followed 
by a component connected in series. Suppose that we have three components at our 
disposal and that any arrangement of the components inside the system is admissible. 
If the reliability of component no. $k$, at a fixed time instant $t_0 > 0$, is equal to $p_k$, for
$k = 1, 2, 3$, and if $0 < p_1 < p_2 < p_3 < 1$, what arrangement of the three components
gives the largest probability that the system is active at time $t_0$?

Question no. 2 

The lifetime $X$ of a certain machine has the following probability density function:

$$f_X(x) = \begin{cases} 1/x & \text{if } 1 \le x \le e, \\ 0 & \text{elsewhere}. \end{cases}$$

Calculate the failure rate function $r(x)$, for $1 < x < e$. Is the distribution of $X$ an IFR
or DFR distribution? Justify.

Question no. 3 

Suppose that the failure rate function $r(t)$ of a given system is $r(t) = 1$ $\forall t > 0$.
Find the probability that the system will fail in the interval [2,3], given that it still 
operates at time t = 1. 

Question no. 4 

A device has a lifetime $T$ that follows an exponential distribution with parameter
$\Lambda$, where $\Lambda$ is a random variable uniformly distributed over the interval [1, 3]. Calculate
the reliability function of the device.

Question no. 5 


Let 



where T denotes the lifetime of a certain system. Calculate the interval failure rate of 
this system in the interval (0,1]. 
Question no. 6

Suppose that the lifetime $T$ of a system has a G(4, 1) distribution. That is,

$$f_T(t) = \frac{1}{6} t^3 e^{-t} \quad \text{for } t > 0.$$

Calculate the reliability function of the system in question at time t = 4, given that it 
is still active at time t = 2. 

Question no. 7 

Calculate the failure rate function at time t = 1 of a system whose lifetime T is 
distributed as |Z|, where Z ~ N(0,1). 

Question no. 8 

Let

$$p_X(k) = \frac{1}{N} \quad \text{for } k = 1, 2, \ldots, N$$

be the probability mass function of the lifetime $X$ (in cycles) of a particular system.
Calculate $r_X(k)$, for $k = 1, \ldots, N$. Does $X$ have an IFR or a DFR distribution? Justify.

Question no. 9 

Consider the interval failure rate of a given system in the interval (n, n + 1], for 
n = 0,1,..., 9. Calculate the average interval failure rate of this system if T ~ U[0,10]. 

Question no. 10 

Assume that $T$ has a Weibull distribution with parameters $\lambda > 0$ and $\beta > 0$. What
is the average failure rate in the interval $[0, \tau]$ if $\tau$ is a random variable distributed as
the square root of a U(0, 1) distribution and (a) $\beta = 2$? (b) $\beta = 3$?

Question no. 11 

Three independent components are connected in parallel. Suppose that the lifetime
of component no. $k$ has an exponential distribution with parameter $\lambda_k$, for $k = 1, 2, 3$.
What is the probability that component no. 3 will not be the first one to fail?

Question no. 12 

A system comprises two components that operate independently of each other, both
from the initial time. Suppose that the lifetime (in cycles) of component no. $k$ has a
geometric distribution with parameter $p_k = 1/2$, for $k = 1, 2$. Find the probability that
both components will fail during the same cycle.

Question no. 13 

Three independent components are connected in series. Suppose that the lifetime
$T_k$ of the kth component has a uniform distribution on the interval $(0, k + 1)$, for
$k = 1, 2, 3$. What is the value of the reliability function of the system (made up of these
three components) at time $t = 1$, given that at least one of the three components is not
down at time $t = 1$?
Question no. 14

Consider the system represented in Figure 2.13, p. 45. Suppose that the components
A have a lifetime that follows an exponential distribution with parameter $\lambda_A$, and the
lifetime of component B (resp., C) is exponentially distributed with parameter $\lambda_B$ (resp.,
$\lambda_C$). Calculate the reliability function of the system, assuming that the components
operate independently of one another.

Question no. 15 

A system consists of two components operating independently of each other and 
connected in parallel. Assume that the lifetime $X_k$ of the kth component is exponentially
distributed with parameter $\lambda_k$, for $k = 1, 2$. What is the probability that the system is
still operating at time t = 2, given that exactly one of its components is down at time 
t = 1? 

Question no. 16 

We have four independent components having an exponentially distributed
lifetime at our disposal. The expected lifetime of the kth component is equal to $1/k$, for
$k = 1, 2, 3, 4$. The components are used to build a system made up of two subsystems
connected in series. Each subsystem comprises two components connected in parallel.
What is the expected lifetime of the system if the first subsystem comprises the first
and the second component (see Figure 5.5)?



Fig. 5.5. Figure for Exercise no. 16. 


Question no. 17 

Let the components of the system in Figure 2.13, p. 45, be numbered as follows:
component C = 1, components A = 2, 3, and 4, and component B = 5. Find the
minimal path sets and minimal cut sets of this system.

Question no. 18 

Suppose that the reliability of the (independent) components in the preceding
question is as in (unsolved) Exercise no. 8 of Chapter 2 (at a fixed time instant $t_0 > 0$). Use
the minimal path sets and minimal cut sets found in the previous question to calculate
the lower and upper bounds for the reliability of the system. Compare with the exact
answer.
Question no. 19

A certain system made up of four independent components operates if and only if 
component no. 1 is active and at least two of the other three components operate. What 
are the minimal path sets and the minimal cut sets of this system? 

Question no. 20 

In the preceding question, (a) what is the probability that the system operates at
time $t_0$ if the reliability of each component is equal to $p$ at $t_0$? (b) What is the reliability
function of the system at time t if the lifetime of each component is an Exp($) random 
variable? 


Multiple choice questions 


Question no. 1 

Suppose that $X \sim \text{Exp}(1)$ and $Y \sim U(0, 1)$ are independent random variables. We
define $Z = \min\{X, Y\}$. Find the failure rate function $r_Z(t)$ of $Z$ for $0 < t < 1$.


(a) 1 (b) (0^ (d)^ ( e ) l~^~t 


Question no. 2 

The lifetime $T$ of a device has a lognormal distribution with parameters $\mu = 10$ and
$\sigma^2 = 4$. That is, $\ln T \sim N(10, 4)$. Find the reliability of the device at time $t = 200$.

(a) 0.01 (b) 0.05 (c) 0.5 (d) 0.95 (e) 0.99 

Question no. 3 

For what values of the parameter $\alpha$ is a beta distribution with parameters $\alpha$ (> 0)
and $\beta = 1$ an IFR distribution everywhere in the interval (0, 1)?

(a) any $\alpha > 0$ (b) none (c) $\alpha > 1$ (d) $\alpha < 1$ (e) $\alpha > 2$

Question no. 4 

Suppose that the lifetime $T$ of a system is uniformly distributed on the interval
$(0, B)$, where $B$ is a random variable having an exponential distribution with parameter
$\lambda > 0$. Find the reliability function of the system.


(a) — — t (b) A — t (c) 1 — y (d) 1 — At (e) A(1 — t) 
A A 


Question no. 5 

Calculate the average failure rate over the interval [0, 2] of a system whose lifetime 
T has the following probability density function: 



where $\lambda$ is a positive parameter.
( a ) ~""2~ ( b )^ ( d ) A ( e ) 2A


Question no. 6 

Two independent components are connected in series. The lifetime $T_k$ of component
no. $k$ has an exponential distribution with parameter $\lambda_k$, for $k = 1, 2$. When a component
fails, it is repaired. Suppose that the lifetime $T_k^*$ of a repaired component is half as long,
on average, as that of a new component, so that $T_k^* \sim \text{Exp}(2\lambda_k)$, for $k = 1, 2$. Find the
probability that component no. 1 will cause the first two failures of the series system.


(a) 

(d) 


A? 


(Ai + A 2 ) 2 

A? 


(b) 


A? 


(c) 


A? 


(2Ai + A 2 )(Ai + A 2 ) 


(Ai + 2A 2 ) 2 v j (Ai + 2A 2 )(Ai + A 2 ) 

(e) A? 


(2Ai + A 2 ) 2 


Question no. 7 

We consider a system made up of two independent components placed in standby
redundancy. The lifetime $T_1$ of component no. 1 has a uniform distribution on the
interval (0, 2), whereas the lifetime $T_2$ of component no. 2 is exponentially distributed
with parameter $\lambda = 2$. Moreover, suppose that component no. 2 relieves the first one as
soon as it fails if $T_1 < 1$, or at time $t = 1$ if component no. 1 is still active at that time
instant. What is the average lifetime of the system?

(a) 9/4 (b) 5/2 (c) 11/4 (d) 3 (e) 13/4 


Question no. 8 

A certain 15-out-of-20 system is such that all the (independent) components have a
probability equal to 3/4 of being active at time $t_0 > 0$.

(i) Calculate the probability $p$ that the system operates at $t = t_0$.

(ii) Use a Poisson approximation to calculate the probability $p$ in (i).

(a) (i) 0.4148; (ii) 0.4405 (b) (i) 0.6172; (ii) 0.5343 
(c) (i) 0.6172; (ii) 0.6160 (d) (i) 0.7858; (ii) 0.6380 
(e) (i) 0.7858; (ii) 0.7622 


Question no. 9 

Consider the system in Figure 5.6. How many (i) minimal path sets and (ii) minimal 
cut sets are there? 


(a) (i) 4; (ii) 4 (b) (i) 5; (ii) 4 (c) (i) 5; (ii) 5 (d) (i) 6; (ii) 4
(e) (i) 6; (ii) 5


Question no. 10 

Suppose that, in the preceding question, the components operate independently of
one another and all have a 3/4 probability of being active at time $t_0 > 0$. Give a lower
bound for the reliability of the system at time $t_0$.

(a) 0.1279 (b) 0.3164 (c) 0.6836 (d) 0.8721 (e) 0.8789 
Fig. 5.6. Figure for multiple choice question no. 9.


Queueing


An important application of probability theory is the field known as queueing theory. 
This field studies the behavior of waiting lines or queues. Telecommunication engineers 
and computer scientists are particularly interested in queueing theory to solve problems 
concerned with the efficient allocation and use of resources in wireless and computer 
networks, for instance. In general, the models considered in this chapter are such that 
the arrivals in the queueing system and the departures from this system both constitute 
Poisson processes, which were defined in Chapter 3 (p. 69). Poisson processes are actually 
particular continuous-time Markov chains , which are the subject of the first section of 
the present chapter. Next, the case when there is a single server in the queueing system
and the case when the number of servers is greater than or equal to two are studied separately.


6.1 Continuous-time Markov chains 

Definition 6.1.1. A stochastic process (or random process) is a set $\{X(t), t \in T\}$
of random variables $X(t)$, where $T$ is a subset of $\mathbb{R}$.

Remarks. (i) The deterministic variable $t$ is often interpreted as time in the problems
considered. In this chapter, we are interested in continuous-time stochastic processes,
so that the set $T$ is generally the interval $[0, \infty)$.

(ii) The set of all possible values of the random variables $X(t)$ is called the state space
of the stochastic process.

A very important class of stochastic processes for the applications is the class of 
those known as Markov processes. 
Definition 6.1.2. If we can write that

$$P[X(t_n) \le x_n \mid X(t), \forall t \le t_{n-1}] = P[X(t_n) \le x_n \mid X(t_{n-1})], \tag{6.1}$$

where $t_{n-1} < t_n$, we say that the stochastic process $\{X(t), t \in T\}$ is a Markov process
(or Markovian process).

Remark. The preceding equation, known as the Markov property, means that the future
of the process depends only on its present state. That is, assuming that the set $T$ is
the interval $[0, \infty)$, the history of the process in the interval $[0, t_{n-1})$ is not needed to
calculate the distribution of the random variable $X(t_n)$, where $t_n > t_{n-1}$, if the value
of $X(t_{n-1})$ is known.

Definition 6.1.3. If the possible values taken by the various random variables $X(t)$ are
assumed to be at most countably infinite, so that $X(t)$ is a discrete random variable for
any fixed value of the variable $t$, then we say that $\{X(t), t \in T\}$ is a discrete-state
stochastic process.

Now, let $\tau_i$ be the time that the continuous-time and discrete-state Markovian
process $\{X(t), t \ge 0\}$ spends in a given state $i$ before making a transition to any other
state. We deduce from the Markov property that

$$P[\tau_i > s + t \mid \tau_i > s] = P[\tau_i > t] \quad \forall s, t \ge 0 \tag{6.2}$$

(otherwise the future would depend on the past). This equation implies that the
continuous random variable $\tau_i$ is exponentially distributed. Indeed, only the exponential
distribution possesses this memoryless property (see p. 76) in continuous time.
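As a quick check that an exponential random variable does satisfy (6.2), write $\tau_i$ for the time spent in state $i$ and suppose that $\tau_i \sim \text{Exp}(\nu)$ for some rate $\nu > 0$ (a symbol chosen here for the illustration), so that $P[\tau_i > t] = e^{-\nu t}$ for $t \ge 0$. Because the event $\{\tau_i > s + t\}$ is included in $\{\tau_i > s\}$, we obtain, for all $s, t \ge 0$:

```latex
\begin{align*}
P[\tau_i > s + t \mid \tau_i > s]
  = \frac{P[\tau_i > s + t]}{P[\tau_i > s]}
  = \frac{e^{-\nu (s + t)}}{e^{-\nu s}}
  = e^{-\nu t}
  = P[\tau_i > t].
\end{align*}
```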

Remarks. (i) We denote the parameter of the random variable $\tau_i$ by $\nu_i$, for any $i$. In
the general case, $\nu_i$ depends on the corresponding state $i$. However, in the case of the
Poisson process with rate $\lambda$, we have that $\nu_i = \lambda$ for all $i$.

(ii) We also deduce from the Markov property that the state that will be visited when
the process leaves its current state $i$ must be independent of the total time $\tau_i$ that the
process spent in $i$ before making a transition.

Definition 6.1.4. The continuous-time and discrete-state stochastic process $\{X(t), t \ge
0\}$ is called a continuous-time Markov chain if

$$P[X(t) = j \mid X(s) = i, X(r) = x_r, 0 \le r < s] = P[X(t) = j \mid X(s) = i]$$

for all $t > s$ and for all states $i, j, x_r$.

Remarks. (i) We assume that the Markov chains considered have time-homogeneous
transition probabilities. That is, if $t > s \ge 0$ and $\tau \ge 0$, we may write that

$$P[X(t) = j \mid X(s) = i] := p_{i,j}(t - s) \implies P[X(\tau + t) = j \mid X(\tau) = i] = p_{i,j}(t).$$

That is, the probability that the process moves from state $i$ to state $j$ in a given time
interval depends only on the length of this time interval. This assumption is made in
most textbooks and is realistic in many applications. The function $p_{i,j}(t)$ is known as
the transition function of the continuous-time Markov chain.

(ii) If $p_{i,j}(t) > 0$ for some $t > 0$ and $p_{j,i}(t^*) > 0$ for some $t^* > 0$, we say that states $i$
and $j$ communicate. If all states communicate, the chain is said to be irreducible.

(iii) In the context of queueing theory, the $X(t)$s are nonnegative integer-valued random
variables. That is, the state space of the stochastic process $\{X(t), t \ge 0\}$ is the set
$\{0, 1, \ldots\}$. Under this assumption, we can write that

$$\sum_{j=0}^{\infty} p_{i,j}(t) = 1 \quad \forall i \in \{0, 1, \ldots\}.$$

Indeed, whatever the state of the process at a fixed time $\tau \ge 0$ is, it must be in some
state at time $\tau + t$, where $t \ge 0$. Note that we have:



for all states $i, j \in \{0, 1, \ldots\}$.

Notation. We denote by $p_{i,j}$ the probability that the continuous-time Markov chain
$\{X(t), t \ge 0\}$, when it leaves its current state $i$, goes to state $j$, for $i, j \in \{0, 1, \ldots\}$.

We have, by definition, $p_{i,i} = 0$ for all states $i$ and

$$\sum_{j=0}^{\infty} p_{i,j} = 1 \quad \forall i \in \{0, 1, \ldots\}.$$


Definition 6.1.5. Let $\{X(t), t \ge 0\}$ be a continuous-time Markov chain with state space



(6.3) 


$$p_{i,i+1} = 1 \quad \text{for all } i,$$

then $\{X(t), t \ge 0\}$ is a pure birth process, whereas in the case when

$$p_{i,i-1} = 1 \quad \text{for all } i \in \{1, 2, \ldots\},$$

we say that $\{X(t), t \ge 0\}$ is a pure death process.
We deduce from the definition that a birth and death process is such that

$$p_{0,1} = 1 \quad \text{and} \quad p_{i,i+1} + p_{i,i-1} = 1 \quad \text{for } i \in \{1, 2, \ldots\}.$$

That is, when the process is in state $i \ge 1$, the next state visited will necessarily be $i + 1$
or $i - 1$.

Remark. The state space of the birth and death process can be a finite set $\{0, 1, 2, \ldots, c\}$.
Then, we have that $p_{c,c-1} = 1$.

In queueing theory, the state of the process at a fixed time instant will generally be
the number of individuals in the queueing system at that time. When $\{X(t), t \ge 0\}$ goes
from state $i$ to $i + 1$, we say that an arrival occurred, and if it moves from $i$ to $i - 1$,
then a departure took place. We assume that, when the chain is in state $i$, the time $A_i$
needed for a new arrival to occur is a random variable having an $\text{Exp}(\lambda_i)$ distribution,
for $i \in \{0, 1, \ldots\}$. Furthermore, $A_i$ is assumed to be independent of the random time
$D_i \sim \text{Exp}(\mu_i)$ until the next departure, for $i \in \{1, 2, \ldots\}$.

Proposition 6.1.1. The total time $\tau_i$ that the birth and death process $\{X(t), t \ge 0\}$
spends in state $i$, on a given visit to that state, is an exponentially distributed random
variable with parameter

$$\nu_i = \begin{cases} \lambda_0 & \text{if } i = 0, \\ \lambda_i + \mu_i & \text{if } i = 1, 2, \ldots \end{cases}$$

Proof. When $i = 0$, we simply have that $\tau_0 = A_0$, so that $\tau_0 \sim \text{Exp}(\lambda_0)$. For $i = 1, 2, \ldots$,
we can write that $\tau_i = \min\{A_i, D_i\}$. The result then follows from Proposition 5.2.1. ■
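Proposition 5.2.1 is not reproduced in this section. The computation behind the proof can nevertheless be sketched directly, assuming, as stated above, that $A_i \sim \text{Exp}(\lambda_i)$ and $D_i \sim \text{Exp}(\mu_i)$ are independent:

```latex
\begin{align*}
P[\min\{A_i, D_i\} > t]
  &= P[A_i > t,\ D_i > t] = P[A_i > t]\, P[D_i > t] \\
  &= e^{-\lambda_i t}\, e^{-\mu_i t} = e^{-(\lambda_i + \mu_i) t} \quad \text{for } t \ge 0,
\end{align*}
```

which is the value of $P[W > t]$ for a random variable $W \sim \text{Exp}(\lambda_i + \mu_i)$.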

Remark. We also have [see (4.6)] that 



and 



Definition 6.1.6. The parameters $\lambda_i$, for $i = 0, 1, \ldots$, are called the birth (or arrival)
rates of the birth and death process $\{X(t), t \ge 0\}$, whereas the parameters $\mu_i$, for
$i = 1, 2, \ldots$, are the death (or departure) rates of the process.

Example 6.1.1. In addition to being a continuous-time Markov chain, the Poisson
process $\{N(t), t \geq 0\}$ is a particular counting process. That is, $N(t)$ denotes the total
number of events in the interval $[0, t]$. Because only the number of events is recorded,
and not whether these events were arrivals or departures, $\{N(t), t \geq 0\}$ is an example
of a pure birth process. It follows that $p_{i,j}(t) = 0$ if $j < i$. Furthermore, using the fact
that the increments of the Poisson process are stationary, we can write that

$$p_{i,j}(t) = P[N(\tau + t) = j \mid N(\tau) = i] = P[N(t) = j - i]$$

$$= P[\mathrm{Poi}(\lambda t) = j - i] = e^{-\lambda t} \frac{(\lambda t)^{j-i}}{(j-i)!} \quad \text{for } j \geq i \geq 0.$$

The time $\tau_i$ that $\{N(t), t \geq 0\}$ spends in any state $i \in \{0, 1, \ldots\}$ follows an
exponential distribution with parameter $\lambda$. Indeed, we have:

$$P[\tau_0 > t] = P[N(t) = 0] = e^{-\lambda t} \quad \text{for } t \geq 0,$$

which implies that $\tau_0 \sim$ Exp($\lambda$). Next, because the Poisson process has independent and
stationary increments, we can then assert that $\tau_i \sim$ Exp($\lambda$), for $i = 1, 2, \ldots$, as well.
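The transition function of the Poisson process in Example 6.1.1 can be evaluated numerically. The following is only a minimal sketch; the function name `poisson_transition` and the numerical values are illustrative assumptions, not part of the text.

```python
import math


def poisson_transition(i, j, t, lam):
    """Return p_{i,j}(t) for a Poisson process with rate lam.

    The process is a pure birth process, so p_{i,j}(t) = 0 when j < i.
    (Hypothetical helper written for illustration.)
    """
    if j < i:
        return 0.0
    n = j - i
    return math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)


# Each row of the transition function sums to 1; the infinite sum is
# truncated here at a value of j large enough for the tail to be negligible.
row_sum = sum(poisson_transition(2, j, 1.5, 2.0) for j in range(2, 60))

# Probability of no event in [0, t], i.e. P[tau_0 > t] = exp(-lam * t).
p_stay = poisson_transition(0, 0, 1.5, 2.0)
```

The quantity `p_stay` illustrates why the time spent in state 0 is exponentially distributed with parameter $\lambda$.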

In general, it is very difficult to calculate explicitly the transition function $p_{i,j}(t)$.
Therefore, we have to express the quantities of interest, such as the average number
of customers in a given queueing system, in terms of the limiting probabilities of the
stochastic process $\{X(t), t \geq 0\}$.

Definition 6.1.7. Let $\{X(t), t \geq 0\}$ be an irreducible continuous-time Markov chain.
The quantity

$$\pi_j := \lim_{t \to \infty} p_{i,j}(t) \quad \text{for all } j \in \{0, 1, \ldots\}$$

is called the limiting probability that the process will be in state $j$ when it is in
equilibrium.

Remarks. (i) We assume that the limiting probabilities $\pi_j$ exist and are independent of
the initial state $i$.

(ii) The $\pi_j$s also represent the proportion of time that the continuous-time Markov chain
spends in state $j$, over a long period of time.

It can be shown that the limiting probabilities $\pi_j$ satisfy the following system of
linear equations:

$$\pi_j \nu_j = \sum_{i \neq j} \pi_i \nu_i \, p_{i,j} \quad \forall\, j \in \{0, 1, \ldots\}. \tag{6.4}$$

To obtain the $\pi_j$s, we can solve the preceding system, under the condition

$$\sum_{j=0}^{\infty} \pi_j = 1. \tag{6.5}$$

Remarks. (i) The various equations in (6.4) are known as the balance equations of the
stochastic process $\{X(t), t \geq 0\}$, because we can interpret them as follows: the departure
rate from state $j$ must be equal to the arrival rate to $j$, for all $j$.

(ii) If $\{X(t), t \geq 0\}$ is a birth and death process with state space $\{0, 1, \ldots\}$, the balance
equations are:

state j          departure rate from j = arrival rate to j

0                $\lambda_0 \pi_0 = \mu_1 \pi_1$
1                $(\lambda_1 + \mu_1)\pi_1 = \mu_2 \pi_2 + \lambda_0 \pi_0$
$k\ (\geq 1)$    $(\lambda_k + \mu_k)\pi_k = \mu_{k+1} \pi_{k+1} + \lambda_{k-1} \pi_{k-1}$


The basic models in queueing theory are particular birth and death processes. For 
this class of processes, we can give the general solution of the balance equations. 


Theorem 6.1.1. If $\{X(t), t \geq 0\}$ is an irreducible birth and death process with state
space $\{0, 1, \ldots\}$, then the limiting probabilities are given by

$$\pi_j = \begin{cases} \dfrac{1}{1 + \sum_{k=1}^{\infty} \Pi_k} & \text{for } j = 0, \\[2ex] \Pi_j \pi_0 & \text{for } j = 1, 2, \ldots, \end{cases} \tag{6.6}$$

where

$$\Pi_k := \frac{\lambda_0 \lambda_1 \cdots \lambda_{k-1}}{\mu_1 \mu_2 \cdots \mu_k} \quad \text{for } k \geq 1.$$

Remark. The limiting probabilities exist if and only if the sum $\sum_{k=1}^{\infty} \Pi_k$ converges. In
the case when the state space of $\{X(t), t \geq 0\}$ is finite, the sum in question always
converges, so that the existence of the limiting probabilities is guaranteed.
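For a finite state space $\{0, 1, \ldots, c\}$, the formula in Theorem 6.1.1 translates directly into a short computation. The sketch below is only illustrative: the function name `birth_death_limiting_probs` and the list-based interface are assumptions made here, and an infinite state space would have to be truncated at a level where the products $\Pi_k$ are negligible.

```python
def birth_death_limiting_probs(birth_rates, death_rates):
    """Limiting probabilities of a finite birth and death process.

    birth_rates = [lambda_0, ..., lambda_{c-1}]
    death_rates = [mu_1, ..., mu_c]
    Returns [pi_0, ..., pi_c] following Theorem 6.1.1.
    (Hypothetical helper written for illustration.)
    """
    if len(birth_rates) != len(death_rates):
        raise ValueError("need as many birth rates as death rates")
    products = []  # products[k-1] holds Pi_k
    prod = 1.0
    for lam, mu in zip(birth_rates, death_rates):
        prod *= lam / mu
        products.append(prod)
    pi0 = 1.0 / (1.0 + sum(products))
    return [pi0] + [p * pi0 for p in products]


# Illustration: constant rates lambda = 1 and mu = 2, truncated at c = 50.
probs = birth_death_limiting_probs([1.0] * 50, [2.0] * 50)
```

With constant rates and a large truncation level, `probs[0]` is numerically close to $1 - \lambda/\mu$.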

Example 6.1.2. Suppose that the birth and death rates of the birth and death process
$\{X(t), t \geq 0\}$ with state space $\{0, 1, 2\}$ are given by

$$\lambda_0 = \lambda_1 = \lambda \quad \text{and} \quad \mu_1 = \mu, \ \mu_2 = 2\mu.$$

Write the balance equations of the system and solve them to obtain the limiting
probabilities.

Solution. We have:

state j    departure rate from j = arrival rate to j

0          $\lambda \pi_0 = \mu \pi_1$
1          $(\lambda + \mu)\pi_1 = 2\mu \pi_2 + \lambda \pi_0$
2          $2\mu \pi_2 = \lambda \pi_1$

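Because this system of equations is simple, we can solve it easily. We deduce from
the equation for state 0 that

$$\pi_1 = \frac{\lambda}{\mu} \pi_0.$$

Similarly, the equation for state 2 implies that

$$\pi_2 = \frac{\lambda}{2\mu} \pi_1 = \frac{1}{2} \left(\frac{\lambda}{\mu}\right)^2 \pi_0.$$

It follows that

$$\pi_0 + \frac{\lambda}{\mu} \pi_0 + \frac{1}{2} \left(\frac{\lambda}{\mu}\right)^2 \pi_0 = 1.$$

That is,

$$\pi_0 = \left[1 + \frac{\lambda}{\mu} + \frac{1}{2} \left(\frac{\lambda}{\mu}\right)^2\right]^{-1} = \frac{2\mu^2}{2\mu^2 + 2\lambda\mu + \lambda^2},$$

so that

$$\pi_1 = \frac{2\lambda\mu}{2\mu^2 + 2\lambda\mu + \lambda^2}$$

and

$$\pi_2 = \frac{\lambda^2}{2\mu^2 + 2\lambda\mu + \lambda^2}.$$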

Remarks. (i) We can check that the equation for state 1, which we did not need to solve
the system of linear equations, is also satisfied by the solution obtained above.

(ii) Because $\{X(t), t \geq 0\}$ is a particular birth and death process, we can also appeal to
Theorem 6.1.1 to find the limiting probabilities. We have:

$$\Pi_1 := \frac{\lambda}{\mu} \quad \text{and} \quad \Pi_2 := \frac{\lambda \times \lambda}{\mu \times 2\mu},$$

from which we retrieve the formulas for $\pi_0$, $\pi_1$, and $\pi_2$.
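The solution of Example 6.1.2 can be checked with exact rational arithmetic. The sketch below assumes the illustrative values $\lambda = 1$ and $\mu = 2$, which are not part of the example, and verifies the balance equation for state 1 that was not used in the derivation.

```python
from fractions import Fraction

# Illustrative rates (assumed values, chosen only for this check).
lam, mu = Fraction(1), Fraction(2)
r = lam / mu

# Limiting probabilities obtained from the equations for states 0 and 2
# together with the normalizing condition.
pi0 = 1 / (1 + r + r**2 / 2)
pi1 = r * pi0
pi2 = r**2 / 2 * pi0

# The equation for state 1 was not needed; it should hold automatically.
state1_balanced = (lam + mu) * pi1 == 2 * mu * pi2 + lam * pi0
```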


6.2 Queueing systems with a single server 

Let $X(t)$ designate the number of customers in a queueing system at time $t$. If we
assume that the times $A_n$ between the arrivals of successive customers and the service
times $S_n$ of customers are independent exponential random variables, then the process
$\{X(t), t \geq 0\}$ is a continuous-time Markov chain. Moreover, in most cases, we also
assume that the customers arrive one at a time and are served one at a time. It follows
that $\{X(t), t \geq 0\}$ is a birth and death process. The arrivals of customers in the system
constitute a Poisson process. It can be shown that the departures from the system in
equilibrium constitute a Poisson process as well. Such a queueing system is denoted by
M/M/s, where $s$ is the number of servers in the system. In the present section, $s$ is
equal to 1.

Remarks. (i) We used the word customers above. However, customers in a queueing
system may actually be machines in a repair shop, jobs in a computer system, or airplanes
arriving or departing from an airport, among others.

(ii) To be precise, we should specify that the random variables $S_n$ are independent of
the $A_n$s. Furthermore, the $S_n$s are identically distributed random variables, and so are
the $A_n$s.

(iii) The notation M for the arrival process (and the departure process) is used because
the Poisson process is Markovian.

We are interested in the average number of customers and the average time that 
an arbitrary customer spends in the queueing system, when it is in equilibrium or in 
stationary regime. 

Notations. We denote, respectively, by $\bar{N}$, $\bar{N}_Q$, and $\bar{N}_S$ the (total) average number
of customers in the system in equilibrium, the average number of customers who are
waiting in line, and the average number of customers being served. Moreover, $\bar{T}$ is the
(total) average time that an arbitrary customer spends in the system, $\bar{Q}$ is the average
waiting time of an arbitrary customer, and $\bar{S}$ is the average service time of an arbitrary
customer.

We have that $\bar{N} = \bar{N}_Q + \bar{N}_S$ and $\bar{T} = \bar{Q} + \bar{S}$. As mentioned in the previous section,
we express the various quantities of interest in terms of the limiting probabilities $\pi_n$ of
the stochastic process $\{X(t), t \geq 0\}$.


Definition 6.2.1. Let $N(t)$, for $t \geq 0$, be the number of arrivals in the system in the
interval $[0, t]$. The quantity

$$\lambda_a := \lim_{t \to \infty} \frac{N(t)}{t} \tag{6.7}$$

is called the average arrival rate of customers in the system.

Remarks. (i) It can be shown that

$$\lim_{t \to \infty} \frac{N(t)}{t} = \frac{1}{E[A_n]}.$$

In our case, we assume that $A_n \sim$ Exp($\lambda$), for $n = 0, 1, \ldots$, so that the stochastic process
$\{N(t), t \geq 0\}$ is a Poisson process with rate $\lambda > 0$. It follows that $\lambda_a = \lambda$.

(ii) When the system capacity is infinite, all the arriving customers can enter the system.
However, in practice, the capacity of any system is finite. Therefore, we also consider the
average entering rate of customers into the system, which is denoted by $\lambda_e$. In the case
when the system capacity is equal to a constant $c$ $(< \infty)$, we have that $\lambda_e = \lambda(1 - \pi_c)$,
because $(1 - \pi_c)$ is the (limiting) probability that an arriving customer will be allowed
to enter the system. Note that, in fact, even if the system capacity is assumed to be
infinite, some arriving customers may decide not to enter the system if they find that
the queue length is too long, for instance. So, in general, $\lambda_e$ is smaller than or equal
to $\lambda$.

(iii) Let $D(t)$ denote the number of departures from the queueing system in the interval
$[0, t]$. We assume that

$$\lambda_d := \lim_{t \to \infty} \frac{D(t)}{t} = \lambda_e.$$


To analyze a given queueing system, we often start by computing its limiting
probabilities $\pi_n$. Next, we try to obtain the quantities $\bar{N}$, $\bar{N}_Q$, and so on, in terms of the
$\pi_n$s. Furthermore, we can use a cost equation to establish a relation between $\bar{N}$ and $\bar{T}$.
Indeed, if we assume that an arbitrary customer pays $1 per time unit that she spends
in the system (either waiting to be served or being served), then it can be shown that

$$\bar{N} = \lambda_e \cdot \bar{T}. \tag{6.8}$$

This equation is known as Little’s formula (or Little’s law). It is valid if we assume that
both $\lambda_e$ and $\bar{T}$ exist and are finite. Moreover, we have:

$$\bar{N} = \lim_{t \to \infty} \frac{1}{t} \int_0^t X(s)\, ds$$

and, if $T_k$ denotes the time spent in the system by the $k$th customer,

$$\bar{T} = \lim_{t \to \infty} \frac{\sum_{k=1}^{N(t)} T_k}{N(t)}.$$

Remarks. (i) Little’s formula holds for very general systems, in particular, for the
M/M/s systems, with finite or infinite capacity, that are studied in this book.

(ii) When $t$ is large enough for the process to be in stationary regime, we may write
that

$$\bar{N} = E[X(t)].$$

Similarly, we have:

$$\bar{N}_S = \lambda_e \cdot \bar{S}. \tag{6.9}$$

It follows, using the fact that $\bar{N} = \bar{N}_Q + \bar{N}_S$ and $\bar{T} = \bar{Q} + \bar{S}$, that

$$\bar{N}_Q = \lambda_e \cdot \bar{Q}.$$

In the case of the M/M/s model, $\lambda_e = \lambda$ and the service times $S_n$ are assumed to be
i.i.d. exponentially distributed random variables with parameter $\mu$. Hence, we deduce
that $\bar{S} = E[S_n] = 1/\mu$. Equation (6.9) then implies that $\bar{N}_S = \lambda/\mu$.


6.2.1 The M/M/1 model 

The most basic queueing system is the M/M/1 model. In this model, we assume that
the successive arrivals of customers constitute a Poisson process with rate $\lambda$ and that
the service times $S_n$ are independent Exp($\mu$) random variables. Furthermore, the $S_n$s
are independent of the interarrival times of customers. Finally, the system capacity is
infinite and we take for granted that all arriving customers decide to enter the system,
whatever the state of the system upon their arrival is.

The stochastic process $\{X(t), t \geq 0\}$, where $X(t)$ denotes the number of customers
in the system at time $t \geq 0$, is an irreducible birth and death process. Indeed, because
the birth rates $\lambda_n \equiv \lambda$ and the death rates $\mu_n \equiv \mu$ are positive for any value of $n$, all
states communicate. We find that the balance equations for the M/M/1 queue are (see
p. 196):

state j          departure rate from j = arrival rate to j

0                $\lambda \pi_0 = \mu \pi_1$
$n\ (\geq 1)$    $(\lambda + \mu)\pi_n = \lambda \pi_{n-1} + \mu \pi_{n+1}$

We can solve the previous system of linear equations, under the condition
$\sum_{n=0}^{\infty} \pi_n = 1$, to obtain the limiting probabilities. However, Theorem 6.1.1 gives us
the solution almost at once. We calculate


$$\Pi_k = \frac{\lambda \lambda \cdots \lambda}{\mu \mu \cdots \mu} = \left(\frac{\lambda}{\mu}\right)^k \quad \text{for } k = 1, 2, \ldots. \tag{6.10}$$

It follows that

$$\sum_{k=1}^{\infty} \Pi_k = \sum_{k=1}^{\infty} \left(\frac{\lambda}{\mu}\right)^k < \infty \quad \text{if and only if} \quad \rho := \frac{\lambda}{\mu} < 1.$$

Remarks. (i) The quantity $\rho$ is called the traffic intensity or the utilization rate of the
system. Because $1/\mu$ is the average service time of an arbitrary customer and $\lambda$ is the
average arrival rate of customers, the condition $\rho < 1$ means that the customers must
not arrive more rapidly than the rate at which they are served or, equivalently, that the
average time between two arrivals must be longer than the average time it takes to serve
one customer, if we want the system to reach a stationary (or steady-state) regime.
When $\rho \geq 1$, we can assert that the queue length will increase indefinitely.


(ii) In Chapter 2, we used Venn diagrams to represent sample spaces and events. In
queueing theory, we draw a state transition diagram to describe a given system. The
possible states of the system are depicted by circles. To indicate that a transition from
state $i$ to state $j$ is possible, we draw an arrow from the circle corresponding to state
$i$ to the one representing $j$. We also write above (or under) each arrow the rate of the
transition in question (see Figure 6.1). Once the appropriate state transition diagram
has been drawn, it is a simple matter to write the balance equations of the system.


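Next, we deduce from Theorem 6.1.1 that, if $\rho < 1$,

$$\pi_0 = \left[1 + \sum_{k=1}^{\infty} \left(\frac{\lambda}{\mu}\right)^k\right]^{-1} = \left[\frac{1}{1 - (\lambda/\mu)}\right]^{-1} = 1 - \frac{\lambda}{\mu} = 1 - \rho$$

and

$$\pi_j = \Pi_j \pi_0 = \left(\frac{\lambda}{\mu}\right)^j \left(1 - \frac{\lambda}{\mu}\right) \quad \text{for } j = 1, 2, \ldots.$$

That is,

$$\pi_k = \rho^k (1 - \rho) \quad \forall\, k \geq 0. \tag{6.11}$$

Fig. 6.1. State transition diagram for the M/M/1 model.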


Making use of the limiting probabilities and Formula (1.7) with $a = 1 - \rho$ and $r = \rho$,
we can write that

$$\bar{N} := \sum_{k=0}^{\infty} k \pi_k = \sum_{k=0}^{\infty} k \rho^k (1 - \rho) = \frac{(1 - \rho)\rho}{(1 - \rho)^2} = \frac{\rho}{1 - \rho}. \tag{6.12}$$

Remarks. (i) Note that $\lim_{\rho \uparrow 1} \bar{N} = \infty$, which reflects the fact that the queue length
grows indefinitely if $\rho = 1$ (and, a fortiori, if $\rho > 1$).

(ii) If we let $N$ denote the (random) number of customers in the system in equilibrium,
so that $\bar{N} = E[N]$, we can write that $N + 1$ has a geometric distribution with
parameter $p := 1 - \rho$.

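Now, we deduce from Little’s formula (6.8) that

$$\bar{T} = \frac{\bar{N}}{\lambda} = \frac{\rho}{\lambda(1 - \rho)} = \frac{1}{\mu - \lambda}. \tag{6.13}$$

Because $\bar{S} = 1/\mu$ and $\bar{N}_S = \lambda/\mu = \rho$, as already mentioned above, it follows that

$$\bar{Q} = \bar{T} - \bar{S} = \frac{1}{\mu - \lambda} - \frac{1}{\mu} = \frac{\lambda}{\mu(\mu - \lambda)} \tag{6.14}$$

and

$$\bar{N}_Q = \bar{N} - \bar{N}_S = \frac{\rho}{1 - \rho} - \rho = \frac{\rho^2}{1 - \rho}. \tag{6.15}$$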
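Formulas (6.12) to (6.15) can be gathered in a small helper to compute the average quantities of an M/M/1 queue. This is a minimal sketch; the function name `mm1_metrics`, the dictionary keys, and the numerical rates are assumptions made for illustration.

```python
def mm1_metrics(lam, mu):
    """Average quantities of an M/M/1 queue in equilibrium (requires lam < mu).

    (Hypothetical helper written for illustration.)
    """
    if not 0 < lam < mu:
        raise ValueError("a stationary regime requires 0 < lam < mu")
    rho = lam / mu
    return {
        "rho": rho,
        "N": rho / (1 - rho),              # (6.12)
        "T": 1 / (mu - lam),               # (6.13)
        "Q": lam / (mu * (mu - lam)),      # (6.14)
        "N_Q": rho**2 / (1 - rho),         # (6.15)
        "N_S": rho,
        "S": 1 / mu,
    }


# Illustrative rates: 2 arrivals and 5 services per time unit.
m = mm1_metrics(2.0, 5.0)
```

The returned values satisfy Little's formula, $\bar{N} = \lambda \bar{T}$ and $\bar{N}_Q = \lambda \bar{Q}$, as well as $\bar{N} = \bar{N}_Q + \bar{N}_S$.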


Remark. Let $N_S$ be the random variable that denotes the number of customers being
served, when the system is in stationary regime. Because there is only one server, $N_S$
has a Bernoulli distribution with parameter $p_0 := 1 - \pi_0$, which yields

$$\bar{N}_S = E[N_S] = p_0 = 1 - \pi_0 = 1 - (1 - \rho) = \rho,$$

as stated above.

We have given above the exact distributions of the random variables $S$, $N$, and
$N_S$. We can also find the distributions of the variables $T$, $Q$, and $N_Q$, where $T$ is the
total time that an arbitrary customer will spend in the system (in equilibrium), $Q$ is
his waiting time, and $N_Q$ is the number of customers waiting to be served. We already
found that $E[T] = \bar{T} = 1/(\mu - \lambda)$. Actually, $T$ is exponentially distributed and the
quantity $\mu - \lambda$ is its parameter.

Proposition 6.2.1. The total time $T$ spent by an arbitrary customer in an M/M/1
queue in equilibrium is an exponentially distributed random variable with parameter
$\mu - \lambda$.


Proof. To prove the result, we condition on the number $K$ of customers already in the
system upon the arrival of the customer of interest. We can write that

$$P[T \leq t] = \sum_{k=0}^{\infty} P[T \leq t \mid K = k]\, P[K = k].$$

Now, by the memoryless property of the exponential distribution, if $K = k$, then
the random variable $T$ is the sum of $k + 1$ independent random variables, all having
an Exp($\mu$) distribution. Indeed, the service time of the customer being served (if $k > 0$),
from the moment when the customer of interest enters the system, also has an
exponential distribution with parameter $\mu$. Using (4.9), we can write that

$$T \mid \{K = k\} \sim G(k + 1, \mu).$$

Next, we can show that when the arrival process is a Poisson process, the probability
$P[K = k]$ that an arbitrary customer finds $k$ customers in the system in equilibrium
upon his arrival is equal to the limiting probability $\pi_k$ that there are $k$ customers in the
system (in equilibrium). It follows that

$$P[K = k] = \pi_k = \rho^k (1 - \rho) \quad \text{for } k = 0, 1, \ldots.$$

Hence, we have:

$$P[T \leq t] = \sum_{k=0}^{\infty} \left[\int_0^t \mu e^{-\mu\tau} \frac{(\mu\tau)^k}{k!}\, d\tau\right] \rho^k (1 - \rho)$$

$$\overset{\rho = \lambda/\mu}{=} (\mu - \lambda) \sum_{k=0}^{\infty} \int_0^t e^{-\mu\tau} \frac{(\lambda\tau)^k}{k!}\, d\tau = (\mu - \lambda) \int_0^t e^{-\mu\tau} \sum_{k=0}^{\infty} \frac{(\lambda\tau)^k}{k!}\, d\tau$$

$$= (\mu - \lambda) \int_0^t e^{-\mu\tau} e^{\lambda\tau}\, d\tau = 1 - e^{-(\mu - \lambda)t},$$

which implies that


$$f_T(t) = \frac{d}{dt} P[T \leq t] = (\mu - \lambda) e^{-(\mu - \lambda)t} \quad \text{for } t \geq 0. \ ■$$

The waiting time, $Q$, of an arbitrary customer entering the system is a random
variable of mixed type. Using the fact that $P[K = k] = \pi_k$, where $K$ is defined above,
we have:

$$P[Q = 0] = P[K = 0] = \pi_0 = 1 - \rho.$$

In the case when $K = k > 0$, we can write that $Q \mid \{K = k\} \sim G(k, \mu)$. Then, conditioning on
all possible values of the random variable $K$, we obtain that

$$P[Q \leq t] = 1 - \frac{\lambda}{\mu} e^{-(\mu - \lambda)t} \quad \text{for } t \geq 0.$$

Remark. We find that $Q \mid \{Q > 0\} \sim$ Exp($\mu - \lambda$). That is, $R := Q \mid \{Q > 0\}$ and the
total time $T$ spent by an arbitrary customer in the system are identically distributed
random variables. Note that an exponential random variable being continuous, we may
define it in the interval $[0, \infty)$ or $(0, \infty)$ indifferently.
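The conditioning argument used for $T$ and $Q$ can be checked numerically: summing the gamma (Erlang) distribution functions weighted by $\pi_k = \rho^k(1 - \rho)$ should give $1 - e^{-(\mu - \lambda)t}$ for $T$ and $1 - (\lambda/\mu)e^{-(\mu - \lambda)t}$ for $Q$. The sketch below is only illustrative; the function names, the truncation level `terms`, and the numerical values are assumptions.

```python
import math


def erlang_cdf(n, mu, t):
    """P[G(n, mu) <= t] for an integer shape parameter n >= 1."""
    term, total = 1.0, 0.0
    for m in range(n):
        total += term             # adds (mu*t)^m / m!
        term *= mu * t / (m + 1)  # next term, computed iteratively
    return 1.0 - math.exp(-mu * t) * total


def sojourn_cdf(t, lam, mu, terms=200):
    """P[T <= t] by conditioning on K, with T | {K = k} ~ G(k + 1, mu)."""
    rho = lam / mu
    return sum(erlang_cdf(k + 1, mu, t) * rho**k * (1 - rho)
               for k in range(terms))


def waiting_cdf(t, lam, mu, terms=200):
    """P[Q <= t]: atom 1 - rho at zero, plus Q | {K = k} ~ G(k, mu), k >= 1."""
    rho = lam / mu
    return (1 - rho) + sum(erlang_cdf(k, mu, t) * rho**k * (1 - rho)
                           for k in range(1, terms))


# Illustrative values: lam = 1, mu = 2, t = 0.7.
lam, mu, t = 1.0, 2.0, 0.7
cdf_T = sojourn_cdf(t, lam, mu)
cdf_Q = waiting_cdf(t, lam, mu)
```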

Finally, the number $N_Q$ of customers waiting in line when the system is in stationary
regime can be expressed as follows:

$$N_Q = \begin{cases} 0 & \text{if } N = 0, \\ N - 1 & \text{if } N = 1, 2, \ldots. \end{cases}$$

Hence, we may write that

$$P[N_Q = 0] = P[N = 0] + P[N = 1] = \pi_0 + \pi_1 = (1 + \rho)(1 - \rho)$$

and

$$P[N_Q = k] = P[N = k + 1] = \pi_{k+1} = \rho^{k+1}(1 - \rho) \quad \text{if } k = 1, 2, \ldots.$$

Remarks. (i) Because we obtained the distributions of all the random variables of
interest, we could calculate their respective variances.

(ii) The random variables $Q$ and $S$ are independent. However, $N_Q$ and $N_S$ are not
independent. Indeed, we may write that

$$N_S = 0 \implies N_Q = 0$$

(because if nobody is being served, then nobody is waiting in line either).

Example 6.2.1. Suppose that at a fixed time instant $t_0 > 0$, the number $X(t_0)$ of
customers in an M/M/1 queue in stationary regime is smaller than or equal to 3.
Calculate the expected value of the random variable $X(t_0)$, as well as its variance, if
$\lambda = \mu/2$.

Solution. Because $\rho = 1/2$, the limiting probabilities of the system are:

$$\pi_k = \rho^k (1 - \rho) = (1/2)^{k+1} \quad \text{for } k = 0, 1, 2, \ldots.$$

It follows that

$$P[X(t_0) \leq 3] = \sum_{k=0}^{3} (1/2)^{k+1} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} = \frac{15}{16}.$$

Therefore, under the condition $X(t_0) \leq 3$, we have:

$$\pi_0 = \frac{1/2}{15/16} = \frac{8}{15}, \quad \pi_1 = \frac{4}{15}, \quad \pi_2 = \frac{2}{15} \quad \text{and} \quad \pi_3 = \frac{1}{15}.$$

It is then easy to obtain the mean and the variance of $X(t_0)$. We calculate

$$E[X(t_0)] = 0 + 1 \times \frac{4}{15} + 2 \times \frac{2}{15} + 3 \times \frac{1}{15} = \frac{11}{15}$$

and

$$E[X^2(t_0)] = 0 + 1^2 \times \frac{4}{15} + 2^2 \times \frac{2}{15} + 3^2 \times \frac{1}{15} = \frac{21}{15},$$

so that

$$\mathrm{VAR}[X(t_0)] = \frac{21}{15} - \left(\frac{11}{15}\right)^2 = \frac{194}{225}.$$

Remark. The fact that $X(t_0) \leq 3$ does not mean that the system capacity is $c = 3$.
When $c = 3$, the random variable $X(t)$ must necessarily be smaller than or equal to
3 for all values of $t$, whereas in the previous example the number of customers in the
system at a fixed time instant was smaller than 4. The case when the system capacity
is finite is the subject of the next subsection.

Example 6.2.2. Often a particular queueing model is a more or less simple
transformation of the basic M/M/1 queue. For example, suppose that the server in an otherwise
M/M/1 queue always waits until there are (at least) two customers in the system
before beginning to serve them, exactly two at a time, at an exponential rate $\mu$. Then,
the stochastic process $\{X(t), t \geq 0\}$, where $X(t)$ denotes the number of customers in
the system at time $t \geq 0$, is still a continuous-time Markov chain. However, it is no
longer a birth and death process. To obtain the limiting probabilities of the process, we
must solve the appropriate balance equations, under the condition $\sum_{k=0}^{\infty} \pi_k = 1$. These
balance equations are (see Figure 6.2):

Fig. 6.2. State transition diagram for the queueing model in Example 6.2.2.

state j          departure rate from j = arrival rate to j

0                $\lambda \pi_0 = \mu \pi_2$                                  (0)
1                $\lambda \pi_1 = \lambda \pi_0 + \mu \pi_3$                  (1)
2                $(\lambda + \mu)\pi_2 = \lambda \pi_1 + \mu \pi_4$           (2)
$k\ (\geq 3)$    $(\lambda + \mu)\pi_k = \lambda \pi_{k-1} + \mu \pi_{k+2}$


A solution of the equation for $k \geq 3$ can be obtained by assuming that $\pi_k = a^{k-2} \pi_2$,
for $k = 2, 3, 4, \ldots$, where $a$ is a constant such that $0 < a < 1$. The equation in question
becomes:

$$(\lambda + \mu) a^{k-2} \pi_2 = \lambda a^{k-3} \pi_2 + \mu a^k \pi_2.$$

Given that $\pi_2$ cannot be equal to zero, we can write that

$$(\lambda + \mu) a = \lambda + \mu a^3.$$

That is, we must solve a third-degree polynomial equation. We find that $a = 1$ is an
obvious root. It follows that

$$(\lambda + \mu) a = \lambda + \mu a^3 \iff (a - 1)(\mu a^2 + \mu a - \lambda) = 0.$$

Hence, the other two roots are:

$$a = -\frac{1}{2} \pm \frac{\sqrt{\mu^2 + 4\mu\lambda}}{2\mu}.$$

Because $a > 0$, we deduce that

$$a = -\frac{1}{2} + \frac{\sqrt{\mu^2 + 4\mu\lambda}}{2\mu} = \frac{1}{2}\left(\sqrt{1 + 4\rho} - 1\right).$$


Remarks. (i) The solution $a = 1$ must be discarded, because it would imply that $\pi_k = \pi_2$,
for $k = 2, 3, \ldots$, so that the condition $\sum_{k=0}^{\infty} \pi_k = 1$ could not be fulfilled.

(ii) We must also have:

$$\frac{1}{2}\left(\sqrt{1 + 4\rho} - 1\right) < 1 \iff \sqrt{1 + 4\rho} < 3.$$

Hence, we deduce that the limiting probabilities exist (if and) only if $\rho < 2$ or,
equivalently, if $\lambda < 2\mu$. That is, the arrival rate of the customers must be smaller than the
rate $2\mu$ at which they can be served. This is the same condition for the existence of the
limiting probabilities as the one in the case of the M/M/2 model, as shown in the next section.

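To complete this example, we use the equations for states 0 and 1 above to express
$\pi_0$ and $\pi_1$ in terms of $\pi_2$. The equation (0) yields at once that $\pi_0 = (\mu/\lambda)\pi_2$, whereas
(0) and (1) together imply that

$$\lambda \pi_1 = \mu \pi_2 + \mu \pi_3 = \mu \pi_2 + \mu a \pi_2 \implies \pi_1 = \frac{\mu}{\lambda}(a + 1)\pi_2.$$

Then, the condition $\sum_{k=0}^{\infty} \pi_k = 1$ enables us to obtain an explicit expression for $\pi_2$
(from which we deduce the value of $\pi_k$, for $k = 0, 1, 3, 4, \ldots$). We have:

$$1 = \sum_{k=0}^{\infty} \pi_k = \frac{\mu}{\lambda}\pi_2 + \frac{\mu}{\lambda}(a + 1)\pi_2 + \sum_{k=2}^{\infty} a^{k-2}\pi_2$$

$$= \pi_2 \left[\frac{\mu}{\lambda}(a + 2) + \sum_{k=0}^{\infty} a^k\right] = \pi_2 \left[\frac{\mu}{\lambda}(a + 2) + \frac{1}{1 - a}\right].$$

Thus, we can write that

$$\pi_2 = \left[\frac{\mu}{\lambda}(a + 2) + \frac{1}{1 - a}\right]^{-1}.$$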

Observe that we did not make use of the equation for state 2 to determine the limiting
probabilities $\pi_k$. Actually, there is always one redundant equation in the system of linear
equations. We can now check that the solution obtained also satisfies the equation (2)
above. We have:

$$(\lambda + \mu)\pi_2 = \lambda \pi_1 + \mu \pi_4 \iff (\lambda + \mu)\pi_2 = \mu(a + 1)\pi_2 + \mu a^2 \pi_2.$$

That is, we must have:

$$\mu a^2 + \mu a - \lambda = 0.$$

But this is exactly the quadratic equation satisfied by the constant $a$ (see above).
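The solution of Example 6.2.2 can also be checked numerically. The sketch below computes $a$, $\pi_0$, $\pi_1$, and $\pi_k = a^{k-2}\pi_2$ for a truncated range of states, and the balance equations can then be tested on the result. The function name `batch_service_probs`, the truncation level `n_states`, and the rates $\lambda = \mu = 1$ are assumptions made for illustration.

```python
import math


def batch_service_probs(lam, mu, n_states=200):
    """Limiting probabilities for the queue of Example 6.2.2 (truncated list).

    Returns (a, [pi_0, pi_1, ..., pi_{n_states - 1}]).
    (Hypothetical helper written for illustration.)
    """
    rho = lam / mu
    if rho >= 2:
        raise ValueError("the limiting probabilities require lam < 2 * mu")
    a = 0.5 * (math.sqrt(1 + 4 * rho) - 1)
    pi2 = 1 / ((mu / lam) * (a + 2) + 1 / (1 - a))
    pi0 = (mu / lam) * pi2
    pi1 = (mu / lam) * (a + 1) * pi2
    tail = [a ** (k - 2) * pi2 for k in range(2, n_states)]
    return a, [pi0, pi1] + tail


lam, mu = 1.0, 1.0
a, pi = batch_service_probs(lam, mu)

# Residuals of the balance equations for state 2 and for a state k >= 3.
res_state2 = (lam + mu) * pi[2] - (lam * pi[1] + mu * pi[4])
res_state5 = (lam + mu) * pi[5] - (lam * pi[4] + mu * pi[7])
```

With $\lambda = \mu$, the constant $a$ is $(\sqrt{5} - 1)/2$, and both residuals vanish up to rounding error.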

Remarks. (i) We have obtained a solution of the balance equations, subject to the
normalizing condition (6.5). Actually, it can be shown that there is a unique solution
of this system of linear equations that satisfies (6.5). Therefore, we can assert that we
have the solution to our problem.

(ii) If the server is able to serve two customers at the same time (also at rate $\mu$), but
begins to work as soon as there is one customer in the system, then the solution is
slightly different (see [20]).

(iii) If we suppose instead that the customers always arrive two at a time, but are served
only one at a time, then the balance equations become:

state j          departure rate from j = arrival rate to j

0                $\lambda \pi_0 = \mu \pi_1$
1                $(\lambda + \mu)\pi_1 = \mu \pi_2$
$k\ (\geq 2)$    $(\lambda + \mu)\pi_k = \lambda \pi_{k-2} + \mu \pi_{k+1}$

In such a case, we could determine at random the respective positions in the queue of
the two customers who arrived together.

(iv) Finally, if the customers always arrive two at a time and are also always served
two at a time, then the limiting probabilities $\pi_n^*$ of the corresponding continuous-time
Markov chain can be expressed in terms of the limiting probabilities $\pi_n$ of the M/M/1
queue as follows:

$$\pi_n^* = \pi_{n/2} = \rho^{n/2}(1 - \rho) \quad \text{for } n = 0, 2, 4, \ldots.$$

6.2.2 The M/M/1 model with finite capacity 

As mentioned previously, in practice the capacity of any queueing system is limited. Let
$c$ be the finite integer denoting this capacity. Suppose that we computed the limiting
probabilities of a given queueing system having finite capacity and that we found that
$\pi_c$ is very small. Then, assuming that $c$ is actually infinite is a valid simplifying
approximation. However, if the probability that the system is saturated is far from being
negligible, then we should use a finite state space.

Suppose that a certain queueing system may be adequately described by an M/M/1
queue having $c + 1$ possible states: $0, 1, \ldots, c$. This model is often denoted by M/M/1/c.
The balance equations of the system are then the following:

state j                  departure rate from j = arrival rate to j

0                        $\lambda \pi_0 = \mu \pi_1$
$k = 1, \ldots, c - 1$   $(\lambda + \mu)\pi_k = \lambda \pi_{k-1} + \mu \pi_{k+1}$
$c$                      $\mu \pi_c = \lambda \pi_{c-1}$

Notice that the balance equations for states $j = 0, 1, \ldots, c - 1$ are identical to the
corresponding ones in the M/M/1/$\infty$ model. When the system has reached its maximum
capacity, namely $c$ customers, the next state visited will necessarily be $c - 1$, at an
exponential rate $\mu$. Moreover, the only way the system may enter state $c$ is from state
$c - 1$, when a new customer arrives.

Let once again $X(t)$ be the number of customers in the system at time $t \geq 0$.
The stochastic process $\{X(t), t \geq 0\}$ is an irreducible birth and death process, as
before. Therefore, instead of solving the previous system of linear equations, subject to
$\sum_{j=0}^{c} \pi_j = 1$ [see (6.5)], we can appeal to Theorem 6.1.1. We still have:

$$\Pi_k = \left(\frac{\lambda}{\mu}\right)^k = \rho^k \quad \text{for } k = 1, 2, \ldots, c,$$

so that

$$\pi_0 = \frac{1}{1 + \sum_{k=1}^{c} \Pi_k} = \frac{1}{1 + \sum_{k=1}^{c} \rho^k} = \frac{1}{\sum_{k=0}^{c} \rho^k}$$

and

$$\pi_j = \Pi_j \pi_0 = \frac{\rho^j}{1 + \sum_{k=1}^{c} \rho^k} = \frac{\rho^j}{\sum_{k=0}^{c} \rho^k} \quad \text{for } j = 1, 2, \ldots, c.$$

Because the state space is finite, the limiting probabilities exist for any (positive)
values of the parameters $\lambda$ and $\mu$. In the particular case when $\rho = 1$, the solution is
simply

$$\pi_j = \frac{1}{c + 1} \quad \text{for } j = 0, 1, \ldots, c. \tag{6.16}$$

That is, when the system is in equilibrium, the $c + 1$ possible states of the Markov chain
are equally likely.

When $\rho \neq 1$, we calculate

$$\sum_{k=0}^{c} \rho^k = \frac{1 - \rho^{c+1}}{1 - \rho}.$$

Hence, we can write that

$$\pi_j = \frac{\rho^j (1 - \rho)}{1 - \rho^{c+1}} \quad \text{for } j = 0, 1, \ldots, c. \tag{6.17}$$


Remark. The probability that the system is saturated is given by

$$\pi_c = \frac{\rho^c (1 - \rho)}{1 - \rho^{c+1}}.$$

Taking the limit as $\rho$ tends to infinity, we obtain that

$$\lim_{\rho \to \infty} \pi_c = \lim_{\rho \to \infty} \frac{\rho^c (1 - \rho)}{1 - \rho^{c+1}} = \lim_{\rho \to \infty} \frac{\rho^{-1} - 1}{\rho^{-(c+1)} - 1} = 1,$$

so that $\pi_j = 0$, for $j = 0, 1, \ldots, c - 1$, as could have been expected. Conversely, we have:

$$\lim_{\rho \downarrow 0} \pi_0 = \lim_{\rho \downarrow 0} \frac{1 - \rho}{1 - \rho^{c+1}} = 1$$

and $\pi_j = 0$, for $j = 1, 2, \ldots, c$. Finally, if $\rho < 1$ and $c$ tends to infinity, we retrieve the
formula

$$\pi_j = \lim_{c \to \infty} \frac{\rho^j (1 - \rho)}{1 - \rho^{c+1}} = \rho^j (1 - \rho) \quad \text{for } j = 0, 1, \ldots$$

obtained in the case of the M/M/1/$\infty$ model.


Making use of (6.16), we easily find that

$$\bar{N} = \frac{c}{2} \quad \text{if } \rho = 1.$$

In the general case when $\rho \neq 1$, we can show that

$$\bar{N} = \frac{\rho}{1 - \rho} - \frac{(c + 1)\rho^{c+1}}{1 - \rho^{c+1}}.$$

Actually, when the system capacity $c$ is small, it is a simple matter to calculate the
value of $\bar{N}$ from the formula

$$\bar{N} \equiv E[N] := \sum_{k=0}^{c} k \pi_k.$$

Likewise, after having calculated the $\pi_k$s, it is not difficult to obtain the variance of the
random variable $N$.

Next, because $N_S$ is equal to 1 if the system in equilibrium is in any state $k \in
\{1, 2, \ldots, c\}$ (and to 0 if the system is empty), the expression for the value of $\bar{N}_S$ is the
same as before, namely:

$$\bar{N}_S = 1 - \pi_0,$$

which implies that

$$\bar{N}_Q = \bar{N} - 1 + \pi_0.$$

However, the limiting probability $\pi_0$ is different from the corresponding one in the
M/M/1/$\infty$ model.

Finally, if we consider only the customers who actually enter the system (in
equilibrium), we may write that their average entering rate is

$$\lambda_e = \lambda(1 - \pi_c).$$

We then deduce from Little’s formula (6.8) that

$$\bar{T} = \frac{\bar{N}}{\lambda(1 - \pi_c)},$$

so that

$$\bar{Q} = \frac{\bar{N}}{\lambda(1 - \pi_c)} - \frac{1}{\mu},$$

because $\bar{S} = E[S] = 1/\mu$, as previously.
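The formulas of this subsection can be combined into one routine for the M/M/1/c queue. The sketch below is only illustrative; the function name `mm1c_metrics`, the returned tuple, and the numerical values are assumptions.

```python
import math


def mm1c_metrics(lam, mu, c):
    """Limiting probabilities and averages for an M/M/1/c queue.

    Returns (pi, n_bar, lam_e, t_bar, q_bar).
    (Hypothetical helper written for illustration.)
    """
    rho = lam / mu
    if math.isclose(rho, 1.0):
        pi = [1.0 / (c + 1)] * (c + 1)                    # (6.16)
    else:
        pi = [rho**j * (1 - rho) / (1 - rho ** (c + 1))   # (6.17)
              for j in range(c + 1)]
    n_bar = sum(k * p for k, p in enumerate(pi))
    lam_e = lam * (1 - pi[c])        # average entering rate
    t_bar = n_bar / lam_e            # Little's formula (6.8)
    q_bar = t_bar - 1 / mu
    return pi, n_bar, lam_e, t_bar, q_bar


# Illustrative values: rho = 1/2 and capacity c = 3.
pi, n_bar, lam_e, t_bar, q_bar = mm1c_metrics(1.0, 2.0, 3)

# Closed-form expression of the average number of customers when rho != 1.
rho, c = 0.5, 3
n_closed = rho / (1 - rho) - (c + 1) * rho ** (c + 1) / (1 - rho ** (c + 1))
```

For these values, the direct sum $\sum_k k\pi_k$ and the closed-form expression both give $11/15$.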
Example 6.2.3. Consider the M/M/1/2 queueing system. That is, the system capacity
is $c = 2$. Suppose that $\lambda = \mu$.

(a) What is the variance of the number of customers in the system in stationary regime?

(b) What is the average number of arrivals into the system (in stationary regime) during
the service time of a given customer?

Solution. (a) We deduce from (6.16) that $\pi_0 = \pi_1 = \pi_2 = 1/3$. It follows that

$$E[N] = \frac{1}{3}(0 + 1 + 2) = 1 \quad \text{and} \quad E[N^2] = \frac{1}{3}(0 + 1 + 4) = \frac{5}{3},$$

so that

$$\mathrm{VAR}[N] = \frac{5}{3} - 1^2 = \frac{2}{3}.$$

(b) Let $t_0 > 0$ be the time instant at which the customer in question begins to be served.
Then, $X(t_0)$ is equal to 1 or 2. Because $\pi_k \equiv 1/3$, we can assert that

$$P[X(t_0) = 1 \mid X(t_0) \in \{1, 2\}] = P[X(t_0) = 2 \mid X(t_0) \in \{1, 2\}] = \frac{1}{2}.$$

Next, let $K$ be the number of customers who enter the system while the customer
of interest is being served. Because $c = 2$, the possible values of the random variable $K$
are 0 and 1. That is, $K$ is a Bernoulli random variable. We have, under the condition
that $X(t_0) \in \{1, 2\}$:

$$P[K = 0] = \frac{1}{2}\{P[K = 0 \mid X(t_0) = 1] + P[K = 0 \mid X(t_0) = 2]\}$$

$$= \frac{1}{2}\{P[K = 0 \mid X(t_0) = 1] + 1\}.$$

Moreover, we can write that

$$P[K = 0 \mid X(t_0) = 1] = P[N(t_0 + S) - N(t_0) = 0] = P[N(S) = 0],$$

where $N(t)$ is the number of arrivals in the interval $[0, t]$ and $S$ is the service time of an
arbitrary customer. Conditioning on the possible values of $S$, we obtain:

$$P[N(S) = 0] = \int_0^{\infty} P[N(S) = 0 \mid S = s]\, f_S(s)\, ds.$$

Because the arrivals of customers and the service times are, by assumption, independent
random variables, we have:

$$P[N(S) = 0] = \int_0^{\infty} P[N(s) = 0]\, \mu e^{-\mu s}\, ds = \int_0^{\infty} e^{-\lambda s} \mu e^{-\mu s}\, ds = \frac{\mu}{\mu + \lambda} \overset{\lambda = \mu}{=} \frac{1}{2}.$$

It follows that

$$P[K = 0] = \frac{1}{2}\left(\frac{1}{2} + 1\right) = \frac{3}{4},$$

which implies that $P[K = 1] = 1/4$ and

$$E[K] = 0 + 1 \times \frac{1}{4} = \frac{1}{4}.$$
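The computations of Example 6.2.3 can be reproduced with exact fractions. The sketch below simply follows the steps of the solution; the variable names and the common value $\lambda = \mu = 1$ are illustrative assumptions (only the equality $\lambda = \mu$ matters).

```python
from fractions import Fraction

lam = mu = Fraction(1)

# (a) Uniform limiting distribution on {0, 1, 2} when rho = 1, see (6.16).
pi = [Fraction(1, 3)] * 3
mean = sum(k * p for k, p in enumerate(pi))
variance = sum(k**2 * p for k, p in enumerate(pi)) - mean**2

# (b) P[N(S) = 0] = mu / (mu + lam): no arrival during an Exp(mu) service time.
p_no_arrival = mu / (mu + lam)

# Given X(t0) in {1, 2}, each state has probability 1/2; if X(t0) = 2 the
# system is full, so no customer can enter during the service.
p_k0 = Fraction(1, 2) * (p_no_arrival + 1)
expected_k = 0 * p_k0 + 1 * (1 - p_k0)
```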


Example 6.2.4. Write the balance equations for the M/M/1/3 queueing system if we
suppose that when the server finishes serving a customer and there are two customers
waiting in line, then he serves them both at the same time, at rate $\mu$.

Solution. Here, the state $X(t)$ of the process cannot simply be the number of
customers in the system at time $t$. Indeed, suppose that there are three customers in the
system. The next state visited will not be the same if two customers are being served
simultaneously or if two customers are waiting in line. In the former case, the system
will go from state 3 to state 1, whereas it will move from state 3 to state 2 in the latter
case. Therefore, we have to be more precise. Let $(m, n)$ be the state of the system if
there are $m$ customers being served and $n$ waiting to be served. The possible states are
then: $(m, 0)$, for $m = 0, 1, 2$, and $(1, 1)$, $(1, 2)$, and $(2, 1)$. The balance equations of the
system are the following (see Figure 6.3):

state (m, n)    departure rate from (m, n) = arrival rate to (m, n)

(0,0)           $\lambda \pi_{(0,0)} = \mu(\pi_{(1,0)} + \pi_{(2,0)})$
(1,0)           $(\lambda + \mu)\pi_{(1,0)} = \lambda \pi_{(0,0)} + \mu(\pi_{(1,1)} + \pi_{(2,1)})$
(1,1)           $(\lambda + \mu)\pi_{(1,1)} = \lambda \pi_{(1,0)}$
(1,2)           $\mu \pi_{(1,2)} = \lambda \pi_{(1,1)}$
(2,0)           $(\lambda + \mu)\pi_{(2,0)} = \mu \pi_{(1,2)}$
(2,1)           $\mu \pi_{(2,1)} = \lambda \pi_{(2,0)}$


To obtain the limiting probabilities, we can solve the previous system of linear
equations, under the condition $\sum_{(m,n)} \pi_{(m,n)} = 1$. We express the $\pi_{(m,n)}$s in terms
of $\pi_{(2,1)}$. For simplicity, we assume that $\lambda = \mu$. Then, the last equation above yields
that $\pi_{(2,0)} = \pi_{(2,1)}$. Next, the equation for state (2,0) implies that $\pi_{(1,2)} = 2\pi_{(2,1)}$. It
follows, using the equation for state (1,2), that we can write that $\pi_{(1,1)} = 2\pi_{(2,1)}$ as
well. The equation for state (1,1) enables us to write that $\pi_{(1,0)} = 4\pi_{(2,1)}$. Finally, the
first equation gives us $\pi_{(0,0)} = 5\pi_{(2,1)}$. Thus, we have:

$$(5 + 4 + 2 + 2 + 1 + 1)\pi_{(2,1)} = 1 \implies \pi_{(2,1)} = \frac{1}{15},$$

so that

$$\pi_{(0,0)} = \frac{1}{3}, \quad \pi_{(1,0)} = \frac{4}{15}, \quad \pi_{(1,1)} = \pi_{(1,2)} = \frac{2}{15} \quad \text{and} \quad \pi_{(2,0)} = \frac{1}{15}.$$

Note that this solution also satisfies the equation for state (1,0), which we did not use
to find the limiting probabilities.



Fig. 6.3. State transition diagram for the queueing model in Example 6.2.4. 


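The average number of customers in the system in stationary regime is given by

$$\bar{N} = \sum_{(m,n)} (m + n)\pi_{(m,n)} = 1 \times \pi_{(1,0)} + 2 \times (\pi_{(1,1)} + \pi_{(2,0)}) + 3 \times (\pi_{(1,2)} + \pi_{(2,1)})$$

$$= \frac{4 + 2 \times 3 + 3 \times 3}{15} = \frac{19}{15}$$

and the average entering rate of customers into the system is

$$\lambda_e = \lambda(1 - \pi_{(1,2)} - \pi_{(2,1)}) = \lambda\left(1 - \frac{3}{15}\right) = \frac{4\lambda}{5}.$$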
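The solution of Example 6.2.4 can be verified by substituting it into all six balance equations, including the one for state (1,0) that was not used. The sketch below assumes $\lambda = \mu = 1$, as in the example's simplification $\lambda = \mu$; the dictionary layout and variable names are illustrative.

```python
from fractions import Fraction as F

lam = mu = F(1)

# Limiting probabilities found in the example, indexed by state (m, n).
pi = {
    (0, 0): F(5, 15), (1, 0): F(4, 15), (1, 1): F(2, 15),
    (1, 2): F(2, 15), (2, 0): F(1, 15), (2, 1): F(1, 15),
}

# Each entry: (departure rate from the state, arrival rate to the state).
balance = {
    (0, 0): (lam * pi[(0, 0)], mu * (pi[(1, 0)] + pi[(2, 0)])),
    (1, 0): ((lam + mu) * pi[(1, 0)],
             lam * pi[(0, 0)] + mu * (pi[(1, 1)] + pi[(2, 1)])),
    (1, 1): ((lam + mu) * pi[(1, 1)], lam * pi[(1, 0)]),
    (1, 2): (mu * pi[(1, 2)], lam * pi[(1, 1)]),
    (2, 0): ((lam + mu) * pi[(2, 0)], mu * pi[(1, 2)]),
    (2, 1): (mu * pi[(2, 1)], lam * pi[(2, 0)]),
}
all_balanced = all(lhs == rhs for lhs, rhs in balance.values())

# Average number of customers and average entering rate.
n_bar = sum((m + n) * p for (m, n), p in pi.items())
lam_e = lam * (1 - pi[(1, 2)] - pi[(2, 1)])
```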


6.3 Queueing systems with two or more servers 

6.3.1 The M/M/s model 

Suppose that all the assumptions that were made in the formulation of the M/M/1
queueing model hold, but that there are actually $s$ servers in the system, where $s \in
\{2, 3, \ldots\}$. We assume that the service times of the $s$ servers are independent Exp($\mu$)
random variables. This model is denoted by M/M/s. The particular case when the
number of servers tends to infinity is considered. Moreover, if the system capacity is
finite, we obtain the M/M/s/c model, which is treated in the next subsection.

As is generally the case in practice, we suppose that the customers wait in a single
line for an idle server (or that they take a number when they enter the system and
wait until their number is called). This means that if there are at most $s$ customers
in the system, then they are all being served, which is not necessarily the case if we
assume that a queue is formed in front of each server. Furthermore, as was implicit in
the previous section, the service policy is that of first come, first served (denoted by
FIFO, for First In, First Out).

Remark. In the examples and the exercises, we often modify the basic M/M/s model.
For instance, we may assume that the servers do not necessarily serve at the same rate
$\mu$, or that the service policy is different from the one by default (i.e., FIFO), and so
on.


Let $X(t)$ represent the number of customers in the system at time $t \ge 0$. The
stochastic process $\{X(t), t \ge 0\}$ is a continuous-time Markov chain. The arrival
process is a Poisson process with rate $\lambda > 0$. Furthermore, even though there are
at least two servers, because the customers are served one at a time and the service
times are exponential (thus, continuous) random variables, two (or more) customers cannot
leave the system at exactly the same time instant. It follows that $\{X(t), t \ge 0\}$ is
a birth and death process. The birth rates $\lambda_k$ are all equal to $\lambda$, and the
death rates $\mu_k$ are given by

\[ \mu_k = \begin{cases} k\mu & \text{if } k = 1, \ldots, s-1, \\ s\mu & \text{if } k = s, s+1, \ldots \end{cases} \]


Indeed, when there are $k$ customers being served simultaneously, the time needed for a
departure to take place is the minimum of $k$ independent Exp($\mu$) random variables.
We know that this minimum has an exponential distribution with parameter
$\mu + \cdots + \mu = k\mu$ (see the remark after Proposition 5.2.1).

We deduce from what precedes that the balance equations for the M/M/s queueing
system are (see Figure 6.4 for the case when $s = 2$):


state j                        departure rate from j = arrival rate to j

0                              $\lambda \pi_0 = \mu \pi_1$

$k \in \{1, \ldots, s-1\}$     $(\lambda + k\mu)\,\pi_k = (k+1)\mu\,\pi_{k+1} + \lambda\,\pi_{k-1}$

$k \in \{s, s+1, \ldots\}$     $(\lambda + s\mu)\,\pi_k = s\mu\,\pi_{k+1} + \lambda\,\pi_{k-1}$


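To solve this system of linear equations, under the condition
$\sum_{k=0}^{\infty} \pi_k = 1$, we make use of Theorem 6.1.1. First, we calculate

\[ \Pi_k = \frac{\lambda \times \lambda \times \cdots \times \lambda}{\mu \times 2\mu \times \cdots \times k\mu} = \frac{1}{k!}\left(\frac{\lambda}{\mu}\right)^{k} = \frac{\rho^k}{k!} \quad \text{for } k = 1, 2, \ldots, s, \]

where $\rho := \lambda/\mu$.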


In the case when $k = s+1, s+2, \ldots$, we find that

\[ \Pi_k = \frac{\rho^k}{s!\, s^{k-s}}. \]

Fig. 6.4. State transition diagram for the M/M/2 model.

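Next, the sum $\sum_{k=1}^{\infty} \Pi_k$ converges if and only if

\[ \sum_{k=s+1}^{\infty} \Pi_k < \infty \iff \sum_{k=s+1}^{\infty} \frac{\rho^k}{s!\, s^{k-s}} < \infty \iff \frac{s^s}{s!} \sum_{k=s+1}^{\infty} \left(\frac{\rho}{s}\right)^{k} < \infty \iff \rho < s, \]

which, again, is tantamount to saying that the arrival rate of the customers must be
smaller than their (maximal) service rate.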

Now, because its birth and death rates are strictly positive, the birth and death
process $\{X(t), t \ge 0\}$ is irreducible and the limiting probabilities can indeed be
obtained from Theorem 6.1.1. We find (after some work) that

\[ \pi_0 = \left[ \sum_{k=0}^{s-1} \frac{\rho^k}{k!} + \frac{\rho^s\, s}{s!\,(s-\rho)} \right]^{-1} \quad \text{if } \rho < s. \tag{6.18} \]


We may then write that

\[ \pi_k = \frac{\rho^k}{k!}\,\pi_0 \quad \text{if } k = 1, \ldots, s \tag{6.19} \]

and

\[ \pi_k = \frac{\rho^k}{s!\, s^{k-s}}\,\pi_0 \quad \text{if } k = s+1, s+2, \ldots \tag{6.20} \]

Obtaining the quantities $\bar{N}$ and $\bar{T}$ requires some effort. First, because the
service rates are, by assumption, all equal to $\mu$, we can write that
$\bar{S} = 1/\mu$. It follows, from Little's formula (with $\lambda_e = \lambda$), that

\[ \bar{N}_S = \lambda \bar{S} = \rho. \]


Next, we can show that

\[ \bar{N}_Q = \frac{\rho^{s+1}\, s}{s!\,(s-\rho)^2}\,\pi_0, \]

from which we deduce that

\[ \bar{N} = \frac{\rho^{s+1}\, s}{s!\,(s-\rho)^2}\,\pi_0 + \rho. \]


Finally, we have:

\[ \bar{Q} = \frac{\bar{N}_Q}{\lambda} = \frac{\rho^{s+1}\, s}{\lambda\, s!\,(s-\rho)^2}\,\pi_0 \quad \text{and} \quad \bar{T} = \bar{Q} + \frac{1}{\mu}. \]
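To make these formulas concrete, here is a minimal Python sketch that evaluates $\pi_0$ from (6.18) and then the averages $\bar{N}_Q$, $\bar{N}$, $\bar{Q}$ and $\bar{T}$ for an M/M/s queue. The function name `mms_metrics` and the dictionary keys are hypothetical choices made for this illustration.

```python
from math import factorial


def mms_metrics(lam, mu, s):
    """Stationary averages of an M/M/s queue (requires rho = lam/mu < s)."""
    rho = lam / mu
    if rho >= s:
        raise ValueError("limiting probabilities exist only if rho < s")
    # Equation (6.18): pi_0 is the inverse of a finite sum plus a tail term.
    tail = rho**s * s / (factorial(s) * (s - rho))
    pi0 = 1.0 / (sum(rho**k / factorial(k) for k in range(s)) + tail)
    # Average number of customers waiting in line.
    n_q = rho**(s + 1) * s / (factorial(s) * (s - rho) ** 2) * pi0
    # Average number in the system: those waiting plus those in service (rho).
    n = n_q + rho
    # Average waiting time and average total time in the system.
    q = n_q / lam
    t = q + 1.0 / mu
    return {"pi0": pi0, "N_Q": n_q, "N": n, "Q": q, "T": t}


metrics = mms_metrics(lam=2.0, mu=1.0, s=3)
print(metrics)
```

With $\lambda = 2$, $\mu = 1$ and $s = 3$, the sketch gives $\pi_0 = 1/9$, and the returned values satisfy Little's formula $\bar{N} = \lambda \bar{T}$ up to rounding error.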


Remark. In some applications, the number of servers is infinite. For example, suppose
that the customers are people visiting a public park and that their service time is the
random time that they spend in the park. Because people do not have to wait to be
served, this situation corresponds to the case when $s$ is infinite. We can write
[see (6.18)] that

\[ \lim_{s \to \infty} \pi_0 = \left( \sum_{k=0}^{\infty} \frac{\rho^k}{k!} \right)^{-1} = e^{-\rho}, \]

so that

\[ \lim_{s \to \infty} \pi_k = \frac{\rho^k}{k!}\, e^{-\rho} \quad \text{for } k = 1, 2, \ldots \tag{6.21} \]

Hence, if $K \sim \text{Poi}(\rho)$, we may state that the limiting probabilities for the
M/M/$\infty$ model are given by

\[ \pi_k = P[K = k] \quad \text{for } k = 0, 1, \ldots \tag{6.22} \]

It follows that $\bar{N} = \rho$. Finally, because the customers never wait to be served,
we have that $\bar{N}_Q = \bar{Q} = 0$, so that $\bar{N}_S = \bar{N} = \rho$ and
$\bar{T} = \bar{S} = 1/\mu$.
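The convergence to the Poisson distribution can be observed numerically. The sketch below, given only as an illustration, evaluates the M/M/s limiting probabilities from (6.18) to (6.20) for a large number of servers and compares them with the Poi($\rho$) probabilities. The helper names `mms_pi` and `poisson_pmf`, and the values $\rho = 2$ and $s = 40$, are assumptions made for this example.

```python
from math import exp, factorial


def mms_pi(k, rho, s):
    """Limiting probability pi_k of an M/M/s queue, from (6.18)-(6.20)."""
    tail = rho**s * s / (factorial(s) * (s - rho))
    pi0 = 1.0 / (sum(rho**j / factorial(j) for j in range(s)) + tail)
    if k <= s:
        return rho**k / factorial(k) * pi0
    return rho**k / (factorial(s) * s ** (k - s)) * pi0


def poisson_pmf(k, rho):
    """P[K = k] when K has a Poisson distribution with parameter rho."""
    return exp(-rho) * rho**k / factorial(k)


# With many servers, the two columns should agree to many decimal places.
for k in range(6):
    print(k, mms_pi(k, 2.0, 40), poisson_pmf(k, 2.0))
```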


Now, another quantity of interest is the probability that all servers are busy, when
the system is in stationary regime. We denote this probability by $\pi_b$. We can show
that

\[ \pi_b = \sum_{k \ge s} \pi_k = \frac{s\,\rho^s}{s!\,(s-\rho)}\,\pi_0 \quad \text{if } \rho < s. \]

It is possible to express, in terms of $\pi_b$, the distribution function of the time
$Q$ that an arbitrary customer waits to be served. The random variable $Q$ is of mixed
type. We have that $P[Q = 0] = 1 - \pi_b$, because $1 - \pi_b$ is the probability that
the customer in question arrives while there are fewer than $s$ customers already present
in the system. In general, we can show that, if $\rho < s$,

\[ P[Q \le t] = 1 - \pi_b\, e^{-(s\mu - \lambda)t} \quad \text{for } t \ge 0. \]
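As an illustration, the following Python sketch computes $\pi_b$ and the distribution function of the waiting time. It assumes the standard M/M/s result $P[Q \le t] = 1 - \pi_b\, e^{-(s\mu - \lambda)t}$ for $t \ge 0$; the function names `prob_all_busy` and `waiting_cdf` are hypothetical.

```python
from math import exp, factorial


def prob_all_busy(lam, mu, s):
    """Probability pi_b that all s servers are busy (requires lam/mu < s)."""
    rho = lam / mu
    if rho >= s:
        raise ValueError("limiting probabilities exist only if rho < s")
    tail = s * rho**s / (factorial(s) * (s - rho))
    pi0 = 1.0 / (sum(rho**k / factorial(k) for k in range(s)) + tail)
    return tail * pi0


def waiting_cdf(t, lam, mu, s):
    """P[Q <= t]: mixed-type distribution with an atom of size 1 - pi_b at 0."""
    if t < 0:
        return 0.0
    return 1.0 - prob_all_busy(lam, mu, s) * exp(-(s * mu - lam) * t)


print(prob_all_busy(2.0, 1.0, 3))     # probability that an arrival must wait
print(waiting_cdf(0.0, 2.0, 1.0, 3))  # P[Q = 0] = 1 - pi_b
print(waiting_cdf(1.0, 2.0, 1.0, 3))
```

For $\lambda = 2$, $\mu = 1$ and $s = 3$, we get $\pi_b = 4/9$, so that $P[Q = 0] = 5/9$.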


Example 6.3.1. Consider the M/M/2 queueing model. Suppose that the two servers 
are busy. Calculate the probability that the service times of the two customers being 
served differ by at most one time unit. 

Solution. Let $S_i$ be the service time of the customer being served by server no. $i$,
for $i = 1, 2$. The variables $S_i$ are independent Exp($\mu$) random variables. By
symmetry, we can write that

\[
\begin{aligned}
P[|S_1 - S_2| \le 1] &= 2P[S_1 < S_2 \le S_1 + 1] \\
&= 2 \int_0^{\infty} \int_{s_1}^{s_1+1} \mu e^{-\mu s_1}\, \mu e^{-\mu s_2}\, ds_2\, ds_1 \\
&= 2 \int_0^{\infty} \mu e^{-\mu s_1} \left\{ \int_{s_1}^{s_1+1} \mu e^{-\mu s_2}\, ds_2 \right\} ds_1 \\
&= 2\,(1 - e^{-\mu}) \int_0^{\infty} \mu e^{-2\mu s_1}\, ds_1 = 2\,(1 - e^{-\mu})\,\frac{1}{2}.
\end{aligned}
\]

Thus, the required probability is equal to $1 - e^{-\mu}$.
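The result $P[|S_1 - S_2| \le 1] = 1 - e^{-\mu}$ can be checked by simulation. The Monte Carlo sketch below is only an illustration; the function name `estimate_prob`, the sample size, the seed and the value $\mu = 1.5$ are arbitrary assumptions.

```python
import math
import random


def estimate_prob(mu, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P[|S1 - S2| <= 1] for independent Exp(mu) times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        s1 = rng.expovariate(mu)  # service time at server no. 1
        s2 = rng.expovariate(mu)  # service time at server no. 2
        if abs(s1 - s2) <= 1.0:
            hits += 1
    return hits / n_samples


mu = 1.5
print("simulated:", estimate_prob(mu))
print("exact:    ", 1 - math.exp(-mu))
```

The two printed values should agree to about two decimal places; the agreement improves as the number of samples grows.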

Remark. If we do not use symmetry, we must compute two double integrals (see Figure 6.5):

\[ P[|S_1 - S_2| \le 1] = \int_0^1 \int_0^{s_1+1} \mu e^{-\mu s_1}\, \mu e^{-\mu s_2}\, ds_2\, ds_1 + \int_1^{\infty} \int_{s_1-1}^{s_1+1} \mu e^{-\mu s_1}\, \mu e^{-\mu s_2}\, ds_2\, ds_1. \]



Fig. 6.5. Figure for Example 6.3.1. 


Example 6.3.2. Suppose that $\lambda = 2\mu$ in an M/M/3 queueing system.

(a) What is the variance of the number of customers in the system at a time $t_0$ large
enough for the system to be in stationary regime, given that nobody is waiting in line
at time $t_0$?

(b) Knowing that the three servers are busy at time $t_0$, what is the probability that
nobody is waiting in line?

Solution. (a) First, $t_0$ being large enough, we can write [see (6.19)] that

\[ \pi_k^* := P[X(t_0) = k \mid X(t_0) \le 3] = \frac{\pi_k}{\pi_0 + \pi_1 + \pi_2 + \pi_3} = \frac{\rho^k / k!}{\sum_{j=0}^{3} \rho^j / j!} \overset{\rho = 2}{=} \frac{2^k / k!}{19/3} \quad \text{for } k = 0, 1, 2, 3. \]

That is, $\pi_0^* = 3/19$, $\pi_1^* = \pi_2^* = 6/19$ and $\pi_3^* = 4/19$.

Remark. Note that because the limiting probabilities $\pi_k$ are expressed in terms of
$\pi_0$, we did not need to calculate the value of $\pi_0$ to obtain $\pi_k^*$. Actually,
we have:

\[ \pi_0 = \left[ \sum_{k=0}^{2} \frac{2^k}{k!} + \frac{2^3 \times 3}{3!\,(3-2)} \right]^{-1} = \frac{1}{9}. \]


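Next, we calculate

\[ E[X(t_0) \mid X(t_0) \le 3] = \frac{1}{19}\,(6 + 2 \times 6 + 3 \times 4) = \frac{30}{19} \]

and

\[ E[X^2(t_0) \mid X(t_0) \le 3] = \frac{1}{19}\,(6 + 2^2 \times 6 + 3^2 \times 4) = \frac{66}{19}, \]

so that

\[ \text{VAR}[X(t_0) \mid X(t_0) \le 3] = \frac{66}{19} - \left(\frac{30}{19}\right)^2 = \frac{354}{361}. \]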


(b) We look for

\[ p := P[X(t_0) = 3 \mid X(t_0) \ge 3] = \frac{P[X(t_0) = 3]}{P[X(t_0) \ge 3]} = \frac{P[X(t_0) = 3]}{1 - P[X(t_0) \le 2]} \overset{(6.19)}{=} \frac{(4/3)\,\pi_0}{1 - (\pi_0 + 2\pi_0 + 2\pi_0)}. \]

So, here we need the explicit value of $\pi_0$. Using the previous remark, we can write
that

\[ p = \frac{4/3}{\pi_0^{-1} - (1 + 2 + 2)} = \frac{4/3}{9 - (1 + 2 + 2)} = \frac{1}{3}. \]
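The computations of this example are easy to reproduce with exact arithmetic. The Python sketch below is an illustration only (the variable names are hypothetical); it uses $\rho = 2$ and $s = 3$, together with formulas (6.18) and (6.19).

```python
from fractions import Fraction
from math import factorial

rho, s = 2, 3

# Unnormalized weights rho^k / k! for k = 0, ..., 3 [see (6.19)].
weights = [Fraction(rho**k, factorial(k)) for k in range(s + 1)]

# (a) Conditional distribution given that nobody is waiting (X <= 3).
cond = [w / sum(weights) for w in weights]
mean = sum(k * p for k, p in enumerate(cond))
second_moment = sum(k**2 * p for k, p in enumerate(cond))
variance = second_moment - mean**2

# (b) pi_0 from (6.18), then P[X = 3 | X >= 3].
pi0 = 1 / (sum(Fraction(rho**k, factorial(k)) for k in range(s))
           + Fraction(rho**s * s, factorial(s) * (s - rho)))
p_nobody_waiting = weights[3] * pi0 / (1 - sum(weights[:3]) * pi0)

print(cond, mean, second_moment, variance)
print(pi0, p_nobody_waiting)
```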

6.3.2 The M/M/s/c model 

Even if the capacity, $c$, of an M/M/s system is finite, the stochastic process
$\{X(t), t \ge 0\}$ remains a birth and death process. Therefore, we can appeal to
Theorem 6.1.1 to obtain the limiting probabilities of the process. One case is of special
importance, namely the one for which $c = s$. The M/M/s/s model is a particular case of
what is known as a loss system, because when all the servers are busy, the arriving
customers cannot (or do not want to) enter the system. Consequently, they are lost. An
example of such a system is a parking lot. In this case, the empty parking places are the
servers, and when the lot is full, arriving drivers must go somewhere else to park their
cars.

Next, we obtain the limiting probabilities for the M/M/s/s model. The balance
equations of this system are the following:

state j                        departure rate from j = arrival rate to j

0                              $\lambda \pi_0 = \mu \pi_1$

$k \in \{1, \ldots, s-1\}$     $(\lambda + k\mu)\,\pi_k = (k+1)\mu\,\pi_{k+1} + \lambda\,\pi_{k-1}$

$s$                            $s\mu\,\pi_s = \lambda\,\pi_{s-1}$

The birth and death process $\{X(t), t \ge 0\}$, where $X(t)$ represents the number of
customers in the system at time $t$, is irreducible. Moreover, inasmuch as the system
capacity is finite, the limiting probabilities exist for all admissible values of the
parameters $\lambda$ and $\mu$. We calculate

\[ \Pi_k = \frac{\lambda \times \lambda \times \cdots \times \lambda}{\mu \times 2\mu \times \cdots \times k\mu} = \frac{\rho^k}{k!} \quad \text{for } k = 1, 2, \ldots, s. \]


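It follows that

\[ \pi_0 = \frac{1}{1 + \sum_{k=1}^{s} \rho^k / k!} = \left( \sum_{k=0}^{s} \frac{\rho^k}{k!} \right)^{-1} \tag{6.23} \]

and

\[ \pi_j = \frac{\rho^j}{j!}\,\pi_0 \quad \text{for } j = 1, \ldots, s. \tag{6.24} \]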


Remarks. (i) If $s$ tends to infinity, we should retrieve the results obtained for the
M/M/$\infty$ queue in the previous subsection [see (6.21)]. Indeed, we have:

\[ \lim_{s \to \infty} \pi_0 = \left( \sum_{k=0}^{\infty} \frac{\rho^k}{k!} \right)^{-1} = (e^{\rho})^{-1} = e^{-\rho}, \]

so that

\[ \lim_{s \to \infty} \pi_j = \frac{\rho^j}{j!}\, e^{-\rho} \quad \text{for } j = 1, 2, \ldots \]


(ii) The probability $\pi_b$ that all servers are busy is given by

\[ \pi_b = \pi_s = \frac{\rho^s / s!}{\sum_{k=0}^{s} \rho^k / k!}, \tag{6.25} \]

which is known as Erlang's formula.

(iii) Because $\rho := \lambda/\mu = \lambda E[S]$, the formulas for the limiting
probabilities may be rewritten as follows:

\[ \pi_0 = \left( \sum_{k=0}^{s} \frac{(\lambda E[S])^k}{k!} \right)^{-1} \quad \text{and} \quad \pi_j = \frac{(\lambda E[S])^j}{j!}\,\pi_0 \quad \text{for } j = 1, \ldots, s. \tag{6.26} \]

A very interesting result states that the previous formulas are valid even if the random
variable $S$ is not exponentially distributed, as long as it is nonnegative. For
instance, $S$ could have a uniform distribution on the interval (0,1), or a gamma
distribution, and so on. That is, (6.26) holds for the M/G/s/s model (called the Erlang
loss system), where G stands for general.
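Since (6.26) depends on the service time only through its mean $E[S]$, the limiting probabilities of a loss system are simple to compute. The following Python sketch is given as an illustration; the function names `erlang_loss_probs` and `erlang_b` are hypothetical.

```python
from math import factorial


def erlang_loss_probs(lam, mean_service, s):
    """Limiting probabilities pi_0, ..., pi_s of an M/G/s/s loss system."""
    a = lam * mean_service  # this is rho = lam * E[S]
    weights = [a**k / factorial(k) for k in range(s + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def erlang_b(lam, mean_service, s):
    """Erlang's formula (6.25): probability pi_s that all servers are busy."""
    return erlang_loss_probs(lam, mean_service, s)[-1]


# Exponential service with mean 1 and lam = 2, two servers.
print(erlang_loss_probs(2.0, 1.0, 2))
# Service time uniform on (2, 4), so E[S] = 3, with lam = 1/3 and two servers.
print(erlang_loss_probs(1 / 3, 3.0, 2))
```

Only the product $\lambda E[S]$ matters, so two service time distributions with the same mean give the same limiting probabilities.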


To complete this subsection, we calculate the various quantities of interest, which
turns out to be easy, because $Q = 0$. It follows at once that
$\bar{Q} = \bar{N}_Q = 0$. Because $\bar{S} = 1/\mu$, as previously, we can write that

\[ \bar{T} = \bar{S} = \frac{1}{\mu} \]

and

\[ \bar{N} = \bar{N}_S = \lambda_e \bar{S} = \lambda\,(1 - \pi_s)\,\frac{1}{\mu} = (1 - \pi_s)\,\rho. \]

Example 6.3.3. The balance equations for the M/M/2/3 queueing system with
$\lambda = 2\mu$ are:

state j    departure rate from j = arrival rate to j

0          $2\mu\,\pi_0 = \mu\,\pi_1$

1          $(2\mu + \mu)\,\pi_1 = 2\mu\,\pi_0 + (2 \times \mu)\,\pi_2$

2          $(2\mu + 2 \times \mu)\,\pi_2 = 2\mu\,\pi_1 + (2 \times \mu)\,\pi_3$

3          $(2 \times \mu)\,\pi_3 = 2\mu\,\pi_2$

That is,

\[ \pi_1 = 2\pi_0, \qquad 3\pi_1 = 2\pi_0 + 2\pi_2, \qquad 2\pi_2 = \pi_1 + \pi_3, \qquad \pi_3 = \pi_2. \]

It is a simple matter to solve this system of linear equations. The equations for states
2 and 3 yield that $\pi_1 = \pi_2 = \pi_3$. Then, making use of the equation for state 0,
we can write that

\[ \pi_0 + 3\,(2\pi_0) = 1 \implies \pi_0 = \frac{1}{7} \quad \text{and} \quad \pi_1 = \pi_2 = \pi_3 = \frac{2}{7}. \]

Finally, we find at once that this solution satisfies the equation for state 1 as well.

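In the case of the M/M/2/2 queueing system (with $\lambda = 2\mu$), we deduce from (6.23)
and (6.24) that

\[ \pi_0 = (1 + 2 + 2)^{-1} = \frac{1}{5} \quad \text{and} \quad \pi_1 = \pi_2 = \frac{2}{5}. \]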
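For finite birth and death processes such as the M/M/2/3 and M/M/2/2 queues of this example, the limiting probabilities are proportional to the products of ratios of birth rates to death rates used with Theorem 6.1.1. The following Python sketch illustrates this with exact fractions; the function name `birth_death_limiting` and the choice $\mu = 1$ (so that $\lambda = 2$) are assumptions made for the illustration.

```python
from fractions import Fraction


def birth_death_limiting(birth, death):
    """Limiting probabilities of a finite birth and death process.

    birth[k] is the rate from state k to k + 1 and death[k] is the rate
    from state k + 1 to k, for k = 0, ..., c - 1.
    """
    weights = [Fraction(1)]
    for lam_k, mu_k1 in zip(birth, death):
        # Weight of the next state: previous weight times lambda_k / mu_(k+1).
        weights.append(weights[-1] * Fraction(lam_k) / Fraction(mu_k1))
    total = sum(weights)
    return [w / total for w in weights]


# M/M/2/3 with lam = 2 and mu = 1: death rates are mu, 2*mu, 2*mu.
mm23 = birth_death_limiting([2, 2, 2], [1, 2, 2])
# M/M/2/2 with lam = 2 and mu = 1: death rates are mu, 2*mu.
mm22 = birth_death_limiting([2, 2], [1, 2])
print(mm23, mm22)
```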
Solved exercises


Question no. 1 

A system is made up of $n$ components that operate independently of one another and
all have an exponential lifetime, with parameter $\mu_k$, for $k = 1, \ldots, n$. When
the system breaks down, the failed components are replaced by new ones. Let $N(t)$ be the
number of system breakdowns in the interval $[0, t]$. Is the stochastic process
$\{N(t), t \ge 0\}$ a continuous-time Markov chain if the components are connected
(a) in series? (b) in parallel? (c) in standby redundancy?

Question no. 2 

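The particular pure birth process known as the Yule process is such that
$\lambda_n = n\lambda$, for $n = 0, 1, \ldots$. It can be shown that

\[ P[X(t) = j \mid X(0) = i] = \binom{j-1}{i-1}\, e^{-i\lambda t}\, (1 - e^{-\lambda t})^{j-i} \quad \text{for } j \ge i \ge 1. \]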



What is the expected value of $X(t)$, given that $X(0) = i > 0$?

Question no. 3

Let $\{X(t), t \ge 0\}$ be a birth and death process with state space $\{0, 1, 2\}$ and
birth and death rates given by

$\lambda_0 = \lambda$, $\lambda_1 = 2\lambda$ and $\mu_1 = \mu$, $\mu_2 = 2\mu$.

Find the limiting probabilities of the process from its balance equations.

Question no. 4 

Find the limiting probabilities of an M/M/1 queue at a (large enough) time instant
$t_0$, given that there are either two, three, or four customers in the system at $t_0$.
Under the same condition, what is the expected time that the first customer who enters
the system after $t_0$ will spend waiting in line if we assume that the customer who was
being served at time $t_0$ is still present when the new one arrives?

Question no. 5 

Suppose that the server in an M/M/1 queueing system works twice as fast when
there are at least three customers in the system, so that $\mu[X(t)] = \mu$ if
$X(t) = 1$ or 2, and $\mu[X(t)] = 2\mu$ if $X(t) \ge 3$. Write the balance equations for
this system. What is the condition for the existence of the limiting probabilities?

Question no. 6 

Consider the M/M/1/2 queueing system in stationary regime. Suppose that a departure
took place at time $t_0$ and that the next two arrivals, from $t_0$, occurred at
$t = t_0 + 1$ and $t = t_0 + 2$. What is the probability that the customer who arrived at
time $t_0 + 2$ was able to enter the system?

Question no. 7 

Suppose that the server in an M/M/1/3 queueing system decides to work twice as
fast, in order to increase the profits of the system. However, after a while, the arrival
rate of customers goes from $\lambda$ to $\lambda/2$, because of poor service. Assume
that $\lambda = \mu$. If every customer who actually enters the system pays \$x, what is
the average amount of money that the system earns per unit of time when the service rate
is $\mu$? Is the server better off to serve at rate $\mu$ or at rate $2\mu$?

Question no. 8 

Let $X(t)$ be the number of customers at time $t$ in an M/M/2 queueing model.
Suppose that $X(t_0) \ge 2$, and let $T_i$ be the departure time of the customer being
served by server no. $i$, for $i = 1, 2$. Calculate the probability $P[T_2 < T_1 + 1]$.

Question no. 9 

Write the balance equations for the M/M/2/3 queueing system if we suppose that
the service time of server no. $i$ has an exponential distribution with parameter
$\mu_i$, for $i = 1, 2$. That is, the two servers do not necessarily work at the same
speed. Assume that, when the system is empty, an arriving customer heads for server no. 1
with probability 1. In terms of the limiting probabilities of the process, what is the
average time that an entering customer spends in the system (in stationary regime)?

Question no. 10

Drivers arrive according to a Poisson process with rate $\lambda$ at a service station
having two gas pumps, but no waiting space for the cars. Suppose that the service time is
a uniformly distributed random variable on the interval (2,4) for either pump,
independently from one driver to the other (and that the various service times are
independent of the interarrival times of the drivers). What are the limiting
probabilities of the system? What is the variance of the number of cars in the service
station in equilibrium if $\lambda = 1/3$?


Exercises 


Question no. 1 

Let $\{N_i(t), t \ge 0\}$ be a Poisson process with rate $\lambda_i$, for $i = 1, 2$.
Assume that the two Poisson processes are independent. We define

$X(t) = N_1(t) - N_2(t)$ for $t \ge 0$.

Is the stochastic process $\{X(t), t \ge 0\}$ a continuous-time Markov chain? Is it a
birth and death process (with state space
$\mathbb{Z} := \{\ldots, -2, -1, 0, 1, 2, \ldots\}$)? Justify your answers.

Question no. 2 

Let $\{N(t), t \ge 0\}$ be a counting process (see Example 6.1.1) for which the time $T$
until the first event and between two successive events has a uniform distribution on
the interval (0,1). Show that the stochastic process $\{N^*(t), t \ge 0\}$, where
$N^*(0) := 0$ and

$N^*(t) := N(-\ln t)$ for $t > 0$

is a pure birth process.

Hint. See Example 3.4.4.

Question no. 3 

Consider the birth and death process $\{X(t), t \ge 0\}$ having birth and death rates
$\lambda_n = \lambda$ and $\mu_n = n\mu$, for $n = 0, 1, 2, \ldots$. Calculate, if they
exist, the limiting probabilities of the process.

Question no. 4 

Let $\{N(t), t \ge 0\}$ be a Poisson process with rate $\lambda = \ln 2$. We define
$W_i = \text{int}(\tau_i + 1)$, for $i = 0, 1, \ldots$, where $\tau_i$ is the time that
the process spends in state $i$, and "int" designates the integer part. It can be shown
that $W_i$ has a geometric distribution with parameter $p = 1 - e^{-\lambda}$. Calculate
the probabilities (a) $P[W_0 = W_1]$; (b) $P[W_0 > W_1]$;
(c) $P[W_0 > W_1 \mid W_1 > 1]$.

Question no. 5 

Suppose that $\{X(t), t \ge 0\}$ is a Yule process with $\lambda = 2$ (see solved exercise no. 2).
Calculate (a) E[rf + + r|]; (b) E[r\ + r| + r|]; (c) the correlation coefficient of 

and rf.

Hint. We have (see Example 3.5.2) that

\[ \int_0^{\infty} x^n\, \lambda e^{-\lambda x}\, dx = \frac{n!}{\lambda^n} \quad \text{for } n = 0, 1, \ldots \]


Question no. 6 

Customers arrive at a certain store according to a Poisson process with rate
$\lambda = 1/2$ per minute. Suppose that each customer stays exactly five minutes in the
store.

(a) What is the expected number of customers in the store at time t = 60? 

(b) What is the expected number of customers in (a), given that the store is not empty 
at time t = 60? 

(c) What is the expected number of customers in (a), given that the number of customers 
in the store at time t = 60 is not equal to 1? 

Question no. 7 

What is the average number of customers in an M/M/1 queueing system in equilibrium,
given that the number of customers is an odd number?

Question no. 8 

Consider an M/M/1 queue in equilibrium, with $\lambda = \mu/2$. What is the probability
that there are more than five customers in the system, given that there are at least two? 

Question no. 9 

Suppose that the service policy for an M/M/1 queue is the following: when the server 
finishes serving a customer, the next one to enter service is picked at random among 
those waiting in line. What is the expected total time that a customer who arrived while 
the system (in equilibrium) was in state 3 spent in the system, given that no customers 
who (possibly) arrived after the customer in question were served before him? 

Question no. 10 

Calculate the variance of the total time $T$ that an arbitrary customer spends in an
M/M/1 queueing system in equilibrium, given that $1 < T < 2$, if $\lambda = 1$ and
$\mu = 2$.

Question no. 11 

Suppose that after having been served, every customer in an M/M/1/2 queueing 
system immediately returns (exactly once) in front of the server (if there was a customer 
waiting to be served, then the returning customer will have to wait until the server is 
free). Define an appropriate state space and write the balance equations of the system. 

Question no. 12 

A customer who was unable to enter an M/M/1/c queueing system (in equilibrium)
at time $t_0$ decides to come back at time $t_0 + 2$. What is the probability that the
customer in question is then able to enter the system, given that exactly one customer
arrived in the interval $(t_0, t_0 + 2)$ and was unable to enter the system as well?

Question no. 13

Suppose that for an otherwise M/M/1/2 queue, a customer who enters the system,
but has to wait for service, decides to leave if she is still waiting after a random time
having an exponential distribution with parameter $\theta$. Moreover, this random time is
independent of the service times and the interarrival times. Let $X(t)$ be the number of
customers in the system at time $t$. The stochastic process $\{X(t), t \ge 0\}$ is a
continuous-time Markov chain (it is a birth and death process, to be precise).
(a) Write the balance equations of the process. (b) Calculate the limiting probabilities
in the case when $\lambda = \mu = \theta = 1$.

Question no. 14 

Assume that the probability that the server, in an M/M/1/3 queueing system, is
unable to provide the service requested by an arbitrary customer is equal to
$p \in (0, 1)$, independently from one customer to the other. Assume also that the time
needed by the server to decide whether he will be able or unable to service a given
customer is an exponentially distributed random variable with parameter $\mu_0$. Define
an appropriate state space and write the balance equations of the system.

Question no. 15 

Consider an M/M/2 queueing system. Suppose that any customer who accepts to
pay twice as much as an ordinary customer for her service will get served at rate
$2\mu$. If one such customer arrives while there are exactly two customers already
present in the system, what is the probability that this new customer will spend less
than $1/\mu$ time units (i.e., the average service time of an ordinary customer) in the
system?

Question no. 16 

Assume that when an M/M/2 queueing system is empty, an arriving customer heads 
for server no. 1 with probability 1/2. What is the probability that the first two customers, 
from the initial time, will be served (a) by server no. 1? (b) by different servers? 

Question no. 17 

Let $X(t)$ be the number of customers in an M/M/2/4 queueing system. Suppose that
$t_0$ is large enough for the system to be in stationary regime. Calculate the
conditional expectation $E[X(t_0) \mid X(t_0) > 0]$ if $\lambda = \mu/2$.

Question no. 18 

For an M/M/4/4 queue in equilibrium, with $\lambda = \mu$, what is the variance of the
number of customers in the system, given that it is not empty?

Question no. 19 

Assume that in a certain M/M/2/3 queueing system, the service time of an arbitrary
customer is an exponentially distributed random variable with parameter $\mu$. However,
if the server has not completed his service after a random time (independent of the
actual service time) having an exponential distribution with parameter $\mu_0$, then the
customer must leave the system. (a) Write the balance equations of the system. (b) What
is the proportion of customers who must leave the system before having been fully
serviced?

Question no. 20

For an M/M/5/5 queue with $\lambda = \mu$, what is the expected value of $1/N$, where
$N$ is the number of customers in the system in stationary regime, given that the system
is neither empty nor full?


Multiple choice questions 


Question no. 1 

Let $\{X(t), t \ge 0\}$ be a birth and death process with rates $\lambda_n = 1$ and
$\mu_n = 2$. Suppose that $X(0) = 0$. What is the probability that the process returns
exactly twice to state 0 before visiting state 2?

(a) 1/27 (b) 4/27 (c) 2/9 (d) 1/3 (e) 4/9 

Question no. 2 

Consider the pure death process $\{X(t), t \ge 0\}$ with rates $\mu_n = 1$. Calculate
the probability $P[X(5) = 0 \mid X(0) = 5]$.

(a) 0.1755 (b) 0.3840 (c) 0.4405 (d) 0.5595 (e) 0.6160 

Question no. 3 

The continuous-time Markov chain $\{X(t), t \ge 0\}$, having state space $\{0, 1, 2\}$,
is such that $\nu_i = \nu$, $p_{0,1} = 1/2$, $p_{1,0} = 1/4$, and $p_{2,0} = 1/4$.
Calculate the limiting probability that the process is not in state 0.

(a) 1/5 (b) 2/5 (c) 1/2 (d) 3/5 (e) 4/5 

Question no. 4 

Let $T$ be the total time spent by an arbitrary customer in an M/M/1 queue with
$\lambda = 1$ and $\mu = 3$. Calculate $E[T^2 \mid T > t_0]$, where $t_0 > 0$.

(a) | + to + tg (b) \ + to + t^ (c) 1 + to + t§ (d) 1/4 (e) 1/2 

Question no. 5 

What is the probability that a customer, who left an M/M/1 queue with $\lambda = 1$
and $\mu = 2$ before the arrival of the next customer, spent less than one time unit in
the system?

(a) |(l - e- 1 ) (b) \ (1 - e~ 2 ) (c) 1 - e " 1 (d) 1 - e “ 2 (e) 1 - |e “ 2 

Question no. 6 

For an M/M/1/2 queueing system, what is the probability that the system was full 
before the first departure took place, given that the second customer arrived at time 
t = 2 ? 

( a ) 577 I 1 — e ~ 2fl ) ( b ) l C 1 - e_2/ ") ( c ) ^ (! - e_/t ) ( d ) 7 (! - e_M ) 

(e) 4 (1 — e ~v — 
Question no. 7

Suppose that the M/M/1/2 queueing system is modified as follows: whenever the
server finishes serving a customer, he is unavailable for a random time $\tau$ (which is
independent of the service times and the interarrival times) having an exponential
distribution with parameter $\theta$. Calculate the limiting probability that the server
is busy serving a customer if $\lambda = \mu = 1$ and $\theta = 1/2$.

Hint. Define the following states:

0: system is empty; server is available;

0*: system is empty; server is unavailable;

1: one customer is being served; nobody is waiting;

1*: one customer is waiting to be served; server is unavailable;

2: one customer is being served and another one is waiting;

2*: two customers are waiting to be served; server is unavailable.

(a) 3/49 (b) 13/49 (c) 16/49 (d) 19/49 (e) 20/49 

Question no. 8 

Suppose that there are five customers in an M/M/2 queueing system at a given time 
instant. Find the probability that the three customers waiting in line will not be served 
by the same server. 

(a) 3/8 (b) 1/2 (c) 5/8 (d) 3/4 (e) 7/8 

Question no. 9 

What is the average number of customers in an M/M/2/4 queueing system in equilibrium,
with $\lambda = \mu$, given that it is neither full nor empty?

(a) 9/7 (b) 11/7 (c) 2 (d) 15/7 (e) 17/7 

Question no. 10 

Suppose that the service time for an M/G/3/3 queueing system, with $\lambda = 1$, is a
random variable $S$ defined by $S = Z^4$, where $Z \sim$ N(0,1). What is the average
number of customers in the system in stationary regime?

Hint. The square of a N(0,1) random variable has a gamma distribution with parameters
$\alpha = \lambda = 1/2$.

(a) 18/13 (b) 41/26 (c) 51/26 (d) 2 (e) 2.5

Time series


In many applications, in particular in economics and in hydrology, people are interested 
in the sequence of values of a certain variable over time. To model the variations of the 
variable of interest, a time series is often used. For example, the flow of a river on a given 
day may be expressed as a function of the flow on the previous days, to which a term 
called noise is added. In the first section, general properties of time series are presented. 
Next, various time series models are studied. Finally, the problem of modeling and using 
time series to forecast future values of the state variable is considered. 

7.1 Introduction 

Let $\{X_n, n = 0, 1, \ldots\}$ be a discrete-time stochastic process. If the index $n$
represents time, it is known as a (discrete-time) time series. The random variable $X_n$
is called the state variable. In this chapter, we are interested in stationary time
series.

Remark. The state variable could actually be a random vector of dimension k. The time 
series would then be a multivariate time series. Likewise, if the index n is a vector, then 
we have a multidimensional or spatial time series. In this book, we only consider the 
case when the state variable is a random variable and the index n is a scalar. 

Definition 7.1.1. The stochastic process $\{X_n, n = 0, 1, \ldots\}$ is said to be
(strictly) stationary if the distribution of the random vector
$(X_{n_1+m}, X_{n_2+m}, \ldots, X_{n_k+m})$ is the same for all
$m \in \{0, 1, \ldots\}$ and for all $k = 1, 2, \ldots$, where
$n_j \in \{0, 1, \ldots\}$ for $j = 1, \ldots, k$.

Remarks. (i) In words, the process $\{X_n, n = 0, 1, \ldots\}$ is stationary if the
distribution of the random vector $(X_{n_1}, X_{n_2}, \ldots, X_{n_k})$ is invariant
under any shift of length $m$ to the right.

(ii) If we consider the process $\{X_n, n = \ldots, -1, 0, 1, \ldots\}$, then $m$ can be
any positive or negative integer in the definition, which is tantamount to saying that
the time origin can be placed anywhere.

Because it is generally difficult to prove that a given stochastic process is stationary,
we often content ourselves with a weaker version of the definition. Note first that if
$\{X_n, n = 0, 1, \ldots\}$ is stationary, then we may write that

\[ P[X_n \le x] = P[X_0 \le x] \quad \text{for any } n \in \{1, 2, \ldots\}, \]

which follows from the above definition with $k = 1$. Hence, we deduce that the expected
value of $X_n$ must not depend on $n$:

\[ E[X_n] := \mu \quad \text{for all } n \in \{0, 1, 2, \ldots\}. \]

Next, if we set $k = 2$ in the definition, we obtain that

\[ P[X_{n_1} \le x_1, X_{n_2} \le x_2] = P[X_{n_1+m} \le x_1, X_{n_2+m} \le x_2] \]

for any $n_1, n_2 \in \{0, 1, \ldots\}$ and for $m \in \{0, 1, \ldots\}$. This time, we
deduce that the distribution of the random vector $(X_{n_1}, X_{n_2})$ depends only on
the difference $n_2 - n_1$ (or $n_1 - n_2$), which implies that

\[ \text{COV}[X_{n_1}, X_{n_2}] := \gamma(n_2 - n_1) = \gamma(n_1 - n_2) \]

for all $n_1, n_2 \in \{0, 1, 2, \ldots\}$. If we choose $n_1 = i$ and $n_2 = i + j$,
then we may write that

\[ \text{COV}[X_i, X_{i+j}] = \gamma(j) = \gamma(-j), \tag{7.1} \]

which also follows from the equation

\[ \text{COV}[X_i, X_{i+j}] = \text{COV}[X_{i+j}, X_i]. \]

Hence, $\text{COV}[X_i, X_{i+j}] = \text{COV}[X_i, X_{i-j}]$ if $j \le i$.

Definition 7.1.2. The function $\gamma(\cdot)$ is called the (auto)covariance function of
the stationary stochastic process $\{X_n, n = 0, 1, \ldots\}$.

Remark. If we assume that $\mu = 0$, or if we consider the centered process
$\{Y_n, n = 0, 1, \ldots\}$ defined by

$Y_n = X_n - \mu$ for $n = 0, 1, \ldots$,

then $\gamma(\cdot)$ is also the autocorrelation function of the process. However, in the
context of time series, this term is generally used for another function (that is defined
below).

Definition 7.1.3. If the stochastic process $\{X_n, n = 0, 1, \ldots\}$ is such that its
mean $E[X_n]$ is equal to a constant $\mu \in \mathbb{R}$ for all $n$ and Equation (7.1)
is satisfied for all $i \in \{0, 1, \ldots\}$ and $j \in \{-i, -i+1, \ldots\}$, then it
is said to be weakly or second-order stationary.

Remarks. (i) Notice that

\[ \text{VAR}[X_n] = \text{COV}[X_n, X_n] = \gamma(0), \]

which is independent of $n$. We denote the variance of $X_n$ by $\sigma^2$ (or by
$\sigma_X^2$ where necessary to avoid possible confusion).

(ii) Equation (7.1) implies that the function $\gamma(\cdot)$ is an even function.
Moreover, it can be shown that

\[ |\text{COV}[X_n, X_m]| \le \text{STD}[X_n]\,\text{STD}[X_m] \]

(because the correlation coefficient of $X_n$ and $X_m$ belongs to the interval
$[-1, 1]$). It follows that

\[ |\text{COV}[X_n, X_{n+j}]| = |\gamma(j)| \le \text{STD}[X_n]\,\text{STD}[X_{n+j}] = \gamma(0) = \sigma^2. \]

Therefore, the function $\gamma(\cdot)$ is bounded above by the positive constant
$\sigma^2$.

(iii) In the case of a continuous-time stochastic process $\{X(t), t \ge 0\}$, the
condition (7.1) becomes

\[ \text{COV}[X(t), X(t+s)] = \gamma(s) = \gamma(-s) \quad \text{for all } t \ge 0 \text{ and } s \ge -t. \]

Furthermore, we may then assume that $\gamma(\cdot)$ is a continuous function.

Definition 7.1.4. Let $\{X_n, n = 0, 1, \ldots\}$ be a weakly stationary stochastic
process. The function

\[ \rho(k) = \frac{\gamma(k)}{\gamma(0)} = \frac{\gamma(k)}{\sigma^2} \]

is called the autocorrelation function of the process.

Remarks. (i) The function $\rho(\cdot)$ is sometimes denoted by ACF. Likewise, some
authors write ACVF for the autocovariance function.

(ii) The autocorrelation function is such that

\[ -1 \le \rho(k) \le 1 \quad \text{and} \quad \rho(0) = 1. \]

Moreover, because $\gamma(k) = \gamma(-k)$, we also have that $\rho(k) = \rho(-k)$.
Therefore, it is not necessary to calculate the autocorrelation function for negative
values of $k$.

Example 7.1.1. Suppose that the random variables $X_0, X_1, \ldots$ are independent and
identically distributed. In addition, assume that $E[X_n] = 0$ and
$\text{VAR}[X_n] = \sigma^2 < \infty$. Then, we have that (for
$i, i+j \in \{0, 1, \ldots\}$)

\[ \text{COV}[X_i, X_{i+j}] = E[X_i X_{i+j}] - 0^2 \overset{\text{ind.}}{=} E[X_i]\,E[X_{i+j}] = 0 \quad \text{if } j \ne 0. \]

In the case when $j = 0$, we can write that

\[ \text{COV}[X_i, X_{i+j}] \overset{j=0}{=} \text{COV}[X_i, X_i] = \text{VAR}[X_i] = \sigma^2. \]

That is,

\[ \text{COV}[X_{n_1}, X_{n_2}] = \begin{cases} 0 & \text{if } n_2 - n_1 \ne 0, \\ \sigma^2 & \text{if } n_2 - n_1 = 0. \end{cases} \]

Because the mean $E[X_n]$ of the random variables is a constant and the covariance
$\text{COV}[X_{n_1}, X_{n_2}]$ depends only on the difference $n_2 - n_1$, the stochastic
process $\{X_n, n = 0, 1, \ldots\}$ is an example of a weakly stationary process. It is
called an i.i.d. noise (with zero mean) and is denoted by IID(0, $\sigma^2$).
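To see what the covariance function of an i.i.d. noise looks like on data, the following Python sketch simulates such a noise and computes sample versions of $\gamma(k)$ and $\rho(k)$. The estimator used (dividing by the sample size $n$) is a common convention and an assumption here, as are the function names, the Gaussian distribution and the value $\sigma = 2$.

```python
import random


def sample_acvf(xs, lag):
    """Sample autocovariance at a nonnegative lag (divides by n)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((xs[t] - mean) * (xs[t + lag] - mean) for t in range(n - lag)) / n


def sample_acf(xs, lag):
    """Sample autocorrelation: sample autocovariance divided by its lag-0 value."""
    return sample_acvf(xs, lag) / sample_acvf(xs, 0)


# Simulated i.i.d. Gaussian noise with mean 0 and standard deviation 2.
rng = random.Random(1)
noise = [rng.gauss(0.0, 2.0) for _ in range(20_000)]

print(sample_acvf(noise, 0))                       # should be close to 4
print([sample_acf(noise, k) for k in range(1, 4)])  # should be close to 0
```

The sample autocovariance at lag 0 estimates $\sigma^2$, while the sample autocorrelations at nonzero lags should be small, in agreement with the covariance function obtained in the example.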

Example 7.1.2. If the stochastic process $\{X_n, n = 0, 1, \ldots\}$ having as state
space the set $\{0, 1, 2, \ldots\}$ is such that

\[ P[X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0] = P[X_{n+1} = j \mid X_n = i] \]

for all states $i_0, \ldots, i_{n-1}, i, j \in \{0, 1, \ldots\}$ and for
$n = 0, 1, \ldots$, then $\{X_n, n = 0, 1, \ldots\}$ is called a (discrete-time) Markov
chain. Assume, in addition, that we can write that

\[ P[X_{n+1} = j \mid X_n = i] := p_{i,j}. \]

That is, the conditional probability does not depend on $n$. Then, the Markov chain is
said to be time-homogeneous. In general, the probability that the Markov chain will
move from state $i$ to state $j$ in $r$ steps is given by

\[ P[X_{n+r} = j \mid X_n = i] := p_{i,j}^{(r)} \]

for any $n$.

Suppose that the limit

\[ \pi_j := \lim_{n \to \infty} P[X_n = j \mid X_0 = i] \]

exists and does not depend on the initial state $i$. If we also suppose that

\[ P[X_0 = i] = \pi_i \quad \text{for all } i \in \{0, 1, \ldots\}, \]

then we can show that

\[ P[X_n = i] = \pi_i \quad \text{for } n = 0, 1, \ldots \text{ and for all } i \in \{0, 1, \ldots\}. \]

Hence, Definition 7.1.1 is satisfied for $k = 1$. Actually, we can show that the Markov
chain $\{X_n, n = 0, 1, \ldots\}$ is (strictly) stationary. In particular, if
$n_2 > n_1$, we have that

\[ P[X_{n_1+m} = i, X_{n_2+m} = j] = P[X_{n_2+m} = j \mid X_{n_1+m} = i]\, P[X_{n_1+m} = i] = p_{i,j}^{(n_2 - n_1)}\, \pi_i. \]

Note that the joint probability $P[X_{n_1+m} = i, X_{n_2+m} = j]$ does not depend on
$m \in \{0, 1, \ldots\}$.
The process in Example 7.1.1 can be generalized as follows.

Definition 7.1.5. Suppose that the random variables $X_0, X_1, \ldots$ are such that $E[X_n] = 0$
and $\mathrm{VAR}[X_n] = \sigma^2 < \infty$. If the $X_n$s are uncorrelated and identically distributed, then
the stochastic process $\{X_n, n = 0,1,\ldots\}$ is called a white noise (process) and is denoted
by WN$(0,\sigma^2)$.

Remarks. (i) An i.i.d. noise is a white noise process, because if two random variables
are independent, then they are also uncorrelated. However, the converse is not always
true.

(ii) A very important particular case is the one when the $X_n$s all have a Gaussian
distribution. We have seen that, in such a case, the random variables are independent if
and only if their (covariance or) correlation coefficient is equal to 0. Hence, a Gaussian
white noise is an i.i.d. noise. We denote this process by GWN$(0,\sigma^2)$.

In Chapter 3, we defined the Gaussian or normal distribution (see p. 70). We now
generalize this definition.

Definition 7.1.6. Let $Z_1, \ldots, Z_m$ be independent $N(0,1)$ random variables, and let $\mu_k$
be a real constant, for $k = 1, \ldots, n$. If the random variable $X_k$ is defined by

$$X_k = \mu_k + \sum_{j=1}^{m} c_{kj} Z_j \quad \text{for } k = 1, \ldots, n,$$

where the $c_{kj}$s are real constants, then the random vector $(X_1, X_2, \ldots, X_n)$ has a
multinormal or multivariate normal distribution.

Remarks. (i) We say that $X_k$ is a linear combination of the random variables $Z_1, \ldots, Z_m$.

(ii) Equivalently, $(X_1, X_2, \ldots, X_n)$ has a multinormal distribution if and only if

$$a_1 X_1 + \cdots + a_n X_n$$

has a normal distribution for all real constants $a_1, \ldots, a_n$ (with at least one $a_i \ne 0$,
although we can actually take all the $a_i$ equal to 0, because the constant 0 can be
considered as a degenerate normal random variable).

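It can be shown that the joint density function of the random vector $\mathbf{X} := (X_1, \ldots, X_n)$
is given by

$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} (\det \mathbf{C})^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \mathbf{m})\, \mathbf{C}^{-1} (\mathbf{x}^T - \mathbf{m}^T) \right\} \tag{7.2}$$

for $\mathbf{x} := (x_1, \ldots, x_n) \in \mathbb{R}^n$, where

$$\mathbf{m} := (E[X_1], \ldots, E[X_n]) = (\mu_1, \ldots, \mu_n),$$

$$\mathbf{C} := \left( \mathrm{COV}[X_i, X_j] \right)_{i,j = 1, \ldots, n}$$

and $T$ denotes the transpose of the vector.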

Remarks. (i) The matrix $\mathbf{C}$ is called the covariance matrix of the random vector $\mathbf{X}$.
Actually, it must be nonsingular for the above expression to be valid. That is, the
determinant of $\mathbf{C}$ must be different from zero.

(ii) Note that the distribution of $\mathbf{X}$ depends only on the vector of means $\mathbf{m}$ and the
covariance matrix $\mathbf{C}$.

(iii) We use the following notation: $\mathbf{X} \sim N(\mathbf{m}, \mathbf{C})$.

Properties. (i) If the random vector $(X_1, X_2, \ldots, X_n)$ has a multinormal distribution,
then any linear combination of the $X_k$s also has a multinormal distribution.

(ii) If, in addition, $\mathrm{COV}[X_i, X_j] = 0$, then $X_i$ and $X_j$ are independent random variables.

Definition 7.1.7. A stochastic process $\{X_n, n = 0,1,\ldots\}$ is said to be Gaussian if
the random vector $(X_{n_1}, \ldots, X_{n_k})$ has a multinormal distribution for all $n_1, \ldots, n_k \in \{0,1,\ldots\}$
and for any $k \in \{1,2,\ldots\}$.

Remarks. (i) The definition can be extended to the case when $\{X(t), t \in T\}$, where
$T \subset \mathbb{R}$, is a continuous-time stochastic process.

(ii) Any affine transformation $\{Y_n, n = 0,1,\ldots\}$ of a Gaussian process $\{X_n, n = 0,1,\ldots\}$
is also a Gaussian process. That is, $Y_n$ is defined by

$$Y_n = \alpha X_n + \beta \quad \text{for } n = 0,1,\ldots,$$

where $\alpha$ $(\ne 0)$ and $\beta$ are constants. The process $\{Z_n, n = 0,1,\ldots\}$ defined by

$$Z_n = \alpha X_{2n} + \beta \quad \text{for } n = 0,1,\ldots,$$

for instance, is a Gaussian process as well. However, if we set

$$Z_n = X_n^2 \quad \text{for } n = 0,1,\ldots,$$

then $\{Z_n, n = 0,1,\ldots\}$ is clearly not a Gaussian process because $Z_n \ge 0$, which implies
that $Z_n$ does not have a Gaussian distribution.

A reason why Gaussian processes are so useful is given in the next proposition. 

Proposition 7.1.1. If a Gaussian process {X n , n = 0,1,...} is weakly stationary, then 
it is also (strictly) stationary. 

Proof. The proof is based on the fact that a multivariate normal distribution is completely
determined by its vector of means and its covariance matrix, as mentioned above.

Example 7.1.3. If $(X_1, X_2)$ is a random vector having a multinormal distribution (i.e.,
a bivariate normal or binormal distribution, to be precise), then its joint probability
density function can be written as follows:

$$f_{X_1,X_2}(x_1,x_2) = c \exp\left\{ -\frac{1}{2(1-\rho^2_{X_1,X_2})} \left[ \sum_{i=1}^{2} \left( \frac{x_i - \mu_{X_i}}{\sigma_{X_i}} \right)^2 - 2\rho_{X_1,X_2} \frac{(x_1 - \mu_{X_1})(x_2 - \mu_{X_2})}{\sigma_{X_1}\sigma_{X_2}} \right] \right\}$$

for any $(x_1,x_2) \in \mathbb{R}^2$, where $\mu_{X_i} \in \mathbb{R}$ and $\sigma_{X_i} > 0$, for $i = 1,2$, and $-1 < \rho_{X_1,X_2} < 1$.
Moreover, $c$ is a positive constant defined by

$$c = \frac{1}{2\pi \sigma_{X_1} \sigma_{X_2} (1 - \rho^2_{X_1,X_2})^{1/2}}$$

(see Figure 7.1).



Fig. 7.1. Joint probability density functions of a random vector having a bivariate normal
distribution with $\mu_{X_1} = 0$, $\mu_{X_2} = 1$, $\sigma_{X_1} = 1$, $\sigma_{X_2} = 4$, $\rho_{X_1,X_2} = 1/2$ (left), and
$\rho_{X_1,X_2} = 5/6$ (right).


Because every linear combination of the random variables $X_1$ and $X_2$ has a Gaussian
distribution, we can state that $X_1$ and $X_2$ themselves are Gaussian random variables.
We find that $E[X_i] = \mu_{X_i}$ and $\mathrm{VAR}[X_i] = \sigma^2_{X_i}$ for $i = 1,2$, which can be checked
by integrating the joint density function $f_{X_1,X_2}(x_1,x_2)$ with respect to the appropriate
variable. It follows that

$$f_{X_2}(x_2 \mid X_1 = x_1) = \frac{f_{X_1,X_2}(x_1,x_2)}{f_{X_1}(x_1)} = c^* \exp\left\{ -\frac{1}{2(1-\rho^2_{X_1,X_2})} \left[ \frac{x_2 - \mu_{X_2}}{\sigma_{X_2}} - \rho_{X_1,X_2} \frac{x_1 - \mu_{X_1}}{\sigma_{X_1}} \right]^2 \right\},$$

where

$$c^* = \frac{1}{\sqrt{2\pi}\, \sigma_{X_2} (1 - \rho^2_{X_1,X_2})^{1/2}}.$$

That is, we can write that

$$X_2 \mid \{X_1 = x_1\} \sim N\left( \mu_{X_2} + \rho_{X_1,X_2} \frac{\sigma_{X_2}}{\sigma_{X_1}} (x_1 - \mu_{X_1}),\ \sigma^2_{X_2} (1 - \rho^2_{X_1,X_2}) \right). \tag{7.3}$$

Remark. Note that if $\rho_{X_1,X_2} = 0$, then $f_{X_2}(x_2 \mid X_1 = x_1) = f_{X_2}(x_2)$, which confirms
the fact that if the correlation coefficient of two Gaussian random variables is equal to
zero, then they are independent random variables (see p. 133).

Example 7.1.4. Let $\{X(t), t \ge 0\}$ be a Gaussian process with stationary and independent
increments such that $X(0) = 0$ and $X(t) \sim N(0,t)$, for $t > 0$. Define

$$Y_n = \begin{cases} 1 & \text{if } X(n) \ge 0, \\ -1 & \text{if } X(n) < 0 \end{cases}$$

for $n = 1,2,\ldots$. Is the stochastic process $\{Y_n, n = 1,2,\ldots\}$ weakly stationary?

Solution. We have that

$$E[Y_n] = 1 \cdot P[X(n) \ge 0] + (-1) \cdot P[X(n) < 0] = 1 \cdot \tfrac{1}{2} + (-1) \cdot \tfrac{1}{2} = 0 \quad \forall\, n \in \{1,2,\ldots\}.$$

Then,

$$\mathrm{COV}[Y_n, Y_{n+m}] = E[Y_n Y_{n+m}].$$

Next, by symmetry and continuity, we can write (for $m \in \{0,1,\ldots\}$) that

$$
\begin{aligned}
E[Y_n Y_{n+m}] &= 2\,P[X(n) > 0, X(n+m) > 0] - 2\,P[X(n) > 0, X(n+m) < 0] \\
&= 2\,\{P[X(n+m) > 0 \mid X(n) > 0] - P[X(n+m) < 0 \mid X(n) > 0]\} \times \underbrace{P[X(n) > 0]}_{1/2} \\
&= 2\,P[X(n+m) > 0 \mid X(n) > 0] - 1 \\
&= 2 \int_0^\infty P[X(n+m) > 0 \mid X(n) > 0, X(n) = u]\, f_{X(n)}(u \mid X(n) > 0)\, du - 1,
\end{aligned}
$$

where

$$f_{X(n)}(u \mid X(n) > 0) = \frac{f_{X(n)}(u)}{P[X(n) > 0]} = 2 f_{X(n)}(u) = \frac{2}{\sqrt{2\pi n}} \exp\left\{ -\frac{u^2}{2n} \right\}$$

for $u > 0$. It follows, using the fact that $\{X(t), t \ge 0\}$ has stationary and independent
increments, that

$$E[Y_n Y_{n+m}] = 4 \int_0^\infty P[N(0,m) > -u]\, \frac{1}{\sqrt{2\pi n}} \exp\left\{ -\frac{u^2}{2n} \right\} du - 1.$$
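Now,

$$
\begin{aligned}
P[N(0,m) > -u] &= 1 - \Phi(-u/\sqrt{m}) = \Phi(u/\sqrt{m}) = \int_{-\infty}^{u/\sqrt{m}} \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\, dz \\
&= \frac{1}{2} + \int_0^{u/\sqrt{m}} \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\, dz \overset{y = z/\sqrt{2}}{=} \frac{1}{2} + \frac{1}{\sqrt{\pi}} \int_0^{u/\sqrt{2m}} e^{-y^2}\, dy \\
&= \frac{1}{2} + \frac{1}{2}\, \mathrm{erf}(u/\sqrt{2m}),
\end{aligned}
$$

where erf is called the error function. Hence,

$$
\begin{aligned}
E[Y_n Y_{n+m}] &= 4 \int_0^\infty \frac{1}{2} \left[ 1 + \mathrm{erf}(u/\sqrt{2m}) \right] \frac{1}{\sqrt{2\pi n}} \exp\left\{ -\frac{u^2}{2n} \right\} du - 1 \\
&= 2 \left\{ \frac{1}{2} + \int_0^\infty \mathrm{erf}(u/\sqrt{2m})\, \frac{1}{\sqrt{2\pi n}} \exp\left\{ -\frac{u^2}{2n} \right\} du \right\} - 1 \\
&= \frac{2}{\sqrt{2\pi n}} \int_0^\infty \mathrm{erf}(u/\sqrt{2m}) \exp\left\{ -\frac{u^2}{2n} \right\} du.
\end{aligned}
$$

Finally, making use of a mathematical software package, such as Maple, we find that
the above integral is given by

$$\frac{\sqrt{2n}}{\sqrt{\pi}} \arctan(\sqrt{n}/\sqrt{m}),$$

so that

$$\mathrm{COV}[Y_n, Y_{n+m}] = E[Y_n Y_{n+m}] = \frac{2}{\pi} \arctan(\sqrt{n}/\sqrt{m}).$$

Because $\mathrm{COV}[Y_n, Y_{n+m}]$ depends on $n$, the stochastic process $\{Y_n, n = 1,2,\ldots\}$ is not
weakly stationary.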
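
The closed form obtained in Example 7.1.4 can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names `sign_cov_formula` and `sign_cov_monte_carlo`, the sample size, and the seed are illustrative assumptions. It simulates $X(n)$ and $X(n+m)$ through one independent Gaussian increment and averages the product $Y_n Y_{n+m}$.

```python
import math
import random


def sign_cov_formula(n, m):
    """Closed form (2/pi) * arctan(sqrt(n/m)); the limit value 1 is used when m == 0."""
    if m == 0:
        return 1.0
    return (2.0 / math.pi) * math.atan(math.sqrt(n / m))


def sign_cov_monte_carlo(n, m, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[Y_n Y_{n+m}] (hypothetical helper, fixed seed)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        x_n = math.sqrt(n) * rng.gauss(0.0, 1.0)         # X(n) ~ N(0, n)
        x_nm = x_n + math.sqrt(m) * rng.gauss(0.0, 1.0)  # add an independent N(0, m) increment
        y_n = 1 if x_n >= 0 else -1
        y_nm = 1 if x_nm >= 0 else -1
        total += y_n * y_nm
    return total / n_samples


if __name__ == "__main__":
    for n in (1, 4, 9):
        print(n, sign_cov_formula(n, 1), sign_cov_monte_carlo(n, 1))
```

With $m$ held fixed, the estimates should change with $n$, which is the dependence on $n$ that rules out weak stationarity.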

Remark. The stochastic process $\{X(t), t \ge 0\}$ is known as a standard Brownian motion
and is very important in applications.


7.2 Particular time series models 

7.2.1 Autoregressive processes 

Let $\{Y_n, n = 0,1,\ldots\}$ be a time series such that $E[Y_n] = \mu$. We define

$$X_n = Y_n - \mu \quad \text{for } n = 0,1,\ldots,$$

so that $E[X_n] = 0$. A simple model is the one for which the state variable $X_n$ is defined
by

$$X_n = \alpha_1 X_{n-1} + \epsilon_n \quad \text{for } n = 1,2,\ldots, \tag{7.4}$$

where $\alpha_1$ $(\ne 0)$ is a constant and $\{\epsilon_n, n = 1,2,\ldots\}$ is a white noise WN$(0,\sigma^2)$ process.
Furthermore, $\epsilon_n$ is independent of (or is at least uncorrelated with) $X_{n-1}, X_{n-2}, \ldots, X_0$.
The stochastic process $\{X_n, n = 0,1,\ldots\}$ is called an autoregressive process. We
have that

$$
\begin{aligned}
X_n &= \alpha_1 X_{n-1} + \epsilon_n = \alpha_1(\alpha_1 X_{n-2} + \epsilon_{n-1}) + \epsilon_n = \alpha_1^2 X_{n-2} + \alpha_1 \epsilon_{n-1} + \epsilon_n \\
&= \cdots = \alpha_1^n X_0 + \sum_{i=1}^{n} \alpha_1^{n-i} \epsilon_i.
\end{aligned}
$$

Now, if we define $\epsilon_0 = X_0$, then we can write that

$$X_n = \sum_{i=0}^{n} \alpha_1^{n-i} \epsilon_i \quad \text{for } n = 0,1,2,\ldots.$$
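It follows (with $k \in \{0,1,\ldots\}$) that

$$
\begin{aligned}
\mathrm{COV}[X_n, X_{n+k}] &= \mathrm{COV}\left[ \sum_{i=0}^{n} \alpha_1^{n-i} \epsilon_i,\ \sum_{i=0}^{n+k} \alpha_1^{n+k-i} \epsilon_i \right] \\
&= \mathrm{COV}\left[ \sum_{i=0}^{n} \alpha_1^{n-i} \epsilon_i,\ \sum_{i=0}^{n} \alpha_1^{n+k-i} \epsilon_i + \sum_{i=n+1}^{n+k} \alpha_1^{n+k-i} \epsilon_i \right] \\
&= \mathrm{COV}\left[ \sum_{i=0}^{n} \alpha_1^{n-i} \epsilon_i,\ \sum_{i=0}^{n} \alpha_1^{n+k-i} \epsilon_i \right] + \mathrm{COV}\left[ \sum_{i=0}^{n} \alpha_1^{n-i} \epsilon_i,\ \sum_{i=n+1}^{n+k} \alpha_1^{n+k-i} \epsilon_i \right].
\end{aligned}
\tag{7.5}
$$

The random variables $\epsilon_i$ being uncorrelated, the second term above is equal to 0, so
that

$$
\begin{aligned}
\mathrm{COV}[X_n, X_{n+k}] &= \mathrm{COV}\left[ \sum_{i=0}^{n} \alpha_1^{n-i} \epsilon_i,\ \sum_{i=0}^{n} \alpha_1^{n+k-i} \epsilon_i \right] \\
&= E\left[ \left( \sum_{i=0}^{n} \alpha_1^{n-i} \epsilon_i \right) \left( \sum_{i=0}^{n} \alpha_1^{n+k-i} \epsilon_i \right) \right] = \sum_{i=0}^{n} \alpha_1^{n-i} \alpha_1^{n+k-i} E[\epsilon_i^2],
\end{aligned}
$$

because

$$E[\epsilon_i \epsilon_j] \overset{\text{uncorr.}}{=} E[\epsilon_i]\, E[\epsilon_j] = 0 \quad \text{if } i \ne j.$$

Assume that $-1 < \alpha_1 < 1$ and that

$$E[\epsilon_0^2] = \mathrm{VAR}[\epsilon_0] = \frac{\sigma^2}{1 - \alpha_1^2}.$$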
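Then,

$$\mathrm{COV}[X_n, X_{n+k}] = \alpha_1^{2n+k} \left( \frac{\sigma^2}{1 - \alpha_1^2} + \sigma^2 \sum_{i=1}^{n} \alpha_1^{-2i} \right) = \frac{\alpha_1^{k}}{1 - \alpha_1^2}\, \sigma^2. \tag{7.6}$$

Indeed, we have that

$$
\begin{aligned}
S_n := \sum_{i=1}^{n} \alpha_1^{-2i} &= \alpha_1^{-2} + \alpha_1^{-4} + \cdots + \alpha_1^{-2n} \\
\Longrightarrow \quad \alpha_1^{-2} S_n &= \alpha_1^{-4} + \cdots + \alpha_1^{-2n} + \alpha_1^{-2n-2} \\
\Longrightarrow \quad (1 - \alpha_1^{-2})\, S_n &= \alpha_1^{-2} - \alpha_1^{-2n-2} \\
\Longrightarrow \quad S_n &= \frac{\alpha_1^{-2} - \alpha_1^{-2n-2}}{1 - \alpha_1^{-2}} = \frac{\alpha_1^{-2n} - 1}{1 - \alpha_1^2},
\end{aligned}
\tag{7.7}
$$

from which Equation (7.6) follows at once.

Similarly,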


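$$
\begin{aligned}
\mathrm{COV}[X_n, X_{n-k}] &= \mathrm{COV}\left[ \sum_{i=0}^{n} \alpha_1^{n-i} \epsilon_i,\ \sum_{i=0}^{n-k} \alpha_1^{n-k-i} \epsilon_i \right] \overset{\text{uncorr.}}{=} \mathrm{COV}\left[ \sum_{i=0}^{n-k} \alpha_1^{n-i} \epsilon_i,\ \sum_{i=0}^{n-k} \alpha_1^{n-k-i} \epsilon_i \right] \\
&= \alpha_1^{2n-k} \sum_{i=0}^{n-k} \alpha_1^{-2i} E[\epsilon_i^2] = \alpha_1^{2n-k} \left( \frac{1}{1 - \alpha_1^2} + \frac{\alpha_1^{-2(n-k)} - 1}{1 - \alpha_1^2} \right) \sigma^2 \\
&= \frac{\alpha_1^{k}}{1 - \alpha_1^2}\, \sigma^2 \quad \text{for } k = 0,1,\ldots,n.
\end{aligned}
$$

Given that $\mathrm{COV}[X_n, X_{n \pm k}]$ can be written as $\gamma(k)$ (and $E[X_n] = 0$), we may assert
that the stochastic process $\{X_n, n = 0,1,\ldots\}$ is a weakly stationary time series.
Furthermore, we have that

$$\sigma_X^2 = \mathrm{VAR}[X_n] = \mathrm{COV}[X_n, X_n] = \frac{\sigma^2}{1 - \alpha_1^2} \quad \text{for } n = 0,1,\ldots,$$

which implies that

$$\rho_{X_n, X_{n+k}} = \alpha_1^{k} \quad \text{for } n, k \in \{0,1,\ldots\}. \tag{7.8}$$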

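Remarks. (i) Let $\mathbb{Z} = \{\ldots,-1,0,1,\ldots\}$. If we consider the stochastic processes
$\{X_n, n \in \mathbb{Z}\}$ and $\{\epsilon_n, n \in \mathbb{Z}\}$, where $\mathrm{VAR}[\epsilon_n] = \sigma^2$, then we can write that

$$X_n = \sum_{i=0}^{\infty} \alpha_1^{i}\, \epsilon_{n-i} \quad \text{for } n \in \mathbb{Z} \tag{7.9}$$

and we retrieve the formula for $\mathrm{COV}[X_n, X_{n+k}]$ if we assume that $|\alpha_1| < 1$, as above.
Indeed, first we have that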
Indeed, first we have that 
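$$E[X_n] = E\left[ \sum_{i=0}^{\infty} \alpha_1^{i}\, \epsilon_{n-i} \right] = 0 \quad \text{for all } n \in \mathbb{Z}.$$

Then, using the fact that $\{\epsilon_n, n \in \mathbb{Z}\}$ is a WN$(0,\sigma^2)$ process, we can write that

$$\mathrm{VAR}[X_n] = E[X_n^2] = E\left[ \left( \sum_{i=0}^{\infty} \alpha_1^{i}\, \epsilon_{n-i} \right)^2 \right] = \sum_{i=0}^{\infty} \alpha_1^{2i} E[\epsilon_{n-i}^2] = \left( \frac{1}{1 - \alpha_1^2} \right) \sigma^2 \quad \text{for all } n \in \mathbb{Z}. \tag{7.10}$$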


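Finally,

$$
\begin{aligned}
\mathrm{COV}[X_n, X_{n+k}] = E[X_n X_{n+k}] &= E\left[ \left( \sum_{i=0}^{\infty} \alpha_1^{i}\, \epsilon_{n-i} \right) \left( \sum_{j=0}^{\infty} \alpha_1^{j}\, \epsilon_{n+k-j} \right) \right] \\
&= \sum_{i=0}^{\infty} \alpha_1^{i+(k+i)} E[\epsilon_{n-i}^2] = \sum_{i=0}^{\infty} \alpha_1^{2i+k} \sigma^2 = \frac{\alpha_1^{k}}{1 - \alpha_1^2}\, \sigma^2
\end{aligned}
$$

if $k \in \{0,1,\ldots\}$. When $k = -1,-2,\ldots$, we have that

$$
\begin{aligned}
\mathrm{COV}[X_n, X_{n+k}] &= E\left[ \left( \sum_{i=0}^{\infty} \alpha_1^{i}\, \epsilon_{n-i} \right) \left( \sum_{j=0}^{\infty} \alpha_1^{j}\, \epsilon_{n+k-j} \right) \right] \\
&= E\left[ \left( \sum_{i=-k}^{\infty} \alpha_1^{i}\, \epsilon_{n-i} \right) \left( \sum_{j=0}^{\infty} \alpha_1^{j}\, \epsilon_{n+k-j} \right) \right] \\
&= \sum_{j=0}^{\infty} \alpha_1^{j-k} \alpha_1^{j} E[\epsilon_{n+k-j}^2] = \sigma^2\, \frac{\alpha_1^{-k}}{1 - \alpha_1^2}.
\end{aligned}
$$

Hence, we can write that

$$\mathrm{COV}[X_n, X_{n+k}] = \frac{\alpha_1^{|k|}}{1 - \alpha_1^2}\, \sigma^2 \quad \text{for all } n, k \in \mathbb{Z}. \tag{7.11}$$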


(ii) Suppose that the independent random variables $\epsilon_n$ are such that

$$P[\epsilon_n = 1] = p_0 = 1 - P[\epsilon_n = -1] \quad \text{for all } n \in \{1,2,\ldots\},$$

where $p_0 \in (0,1)$. Then, the stochastic process $\{X_n, n = 0,1,\ldots\}$ defined by

$$X_n = X_{n-1} + \epsilon_n \quad \text{for } n = 1,2,\ldots$$

(with $\epsilon_n$ independent of $X_{n-1}, \ldots, X_0$, for $n = 1,2,\ldots$) is called a random walk. The
initial state $X_0$ is often chosen to be 0. Note that if $p_0 = 1/2$, so that $E[\epsilon_n] = 0$,
$\{X_n, n = 0,1,\ldots\}$ is an autoregressive process with $\alpha_1 = 1$. However, it is not stationary,
which can be seen as follows:

$$X_n = X_{n-1} + \epsilon_n = X_{n-2} + \epsilon_{n-1} + \epsilon_n = \cdots = X_0 + \sum_{i=1}^{n} \epsilon_i,$$

so that

$$E[X_n] = E[X_0] + \sum_{i=1}^{n} E[\epsilon_i] = E[X_0],$$

a constant, but

$$\mathrm{VAR}[X_n] \overset{\text{ind.}}{=} \mathrm{VAR}[X_0] + \sum_{i=1}^{n} \mathrm{VAR}[\epsilon_i] = \mathrm{VAR}[X_0] + n\sigma^2,$$

where $\sigma^2 = 1 - 0^2 = 1$. Because the variance of $X_n$ depends on $n$, the process
$\{X_n, n = 0,1,\ldots\}$ is not stationary (not even weakly).
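
The two remarks above can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: Gaussian white noise is used for the AR(1) process (the text only requires white noise), and the names `ar1_autocov`, `simulate_ar1`, `sample_autocov`, and `random_walk_variance`, as well as the burn-in length and seeds, are hypothetical choices. It compares the theoretical autocovariance $\alpha_1^{|k|}\sigma^2/(1-\alpha_1^2)$ with a sample estimate, and shows that the variance of the symmetric random walk started at 0 grows like $n$.

```python
import random


def ar1_autocov(alpha1, sigma2, k):
    """Theoretical autocovariance alpha1^|k| * sigma2 / (1 - alpha1^2) of a stationary AR(1)."""
    if abs(alpha1) >= 1:
        raise ValueError("weak stationarity requires |alpha1| < 1")
    return alpha1 ** abs(k) * sigma2 / (1.0 - alpha1 ** 2)


def simulate_ar1(alpha1, sigma, n, seed=0, burn_in=500):
    """Simulate X_n = alpha1 * X_{n-1} + eps_n with Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    x = 0.0
    path = []
    for t in range(n + burn_in):
        x = alpha1 * x + rng.gauss(0.0, sigma)
        if t >= burn_in:  # discard the transient caused by starting at 0
            path.append(x)
    return path


def sample_autocov(path, k):
    """Sample autocovariance at lag k >= 0, using the known mean 0 of the process."""
    n = len(path) - k
    return sum(path[t] * path[t + k] for t in range(n)) / n


def random_walk_variance(n_steps, n_paths=4000, seed=0):
    """Sample variance of X_n for the symmetric random walk (p0 = 1/2) started at X_0 = 0."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        x = 0
        for _ in range(n_steps):
            x += rng.choice((-1, 1))
        finals.append(x)
    mean = sum(finals) / n_paths
    return sum((v - mean) ** 2 for v in finals) / (n_paths - 1)


if __name__ == "__main__":
    path = simulate_ar1(0.5, 1.0, 50000)
    for k in range(4):
        print(k, ar1_autocov(0.5, 1.0, k), sample_autocov(path, k))
    print(random_walk_variance(25), random_walk_variance(100))
```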

The time series defined by Equation (7.4) is known as an autoregressive process of
order 1, because it assumes that $X_n$ depends only on one value of the process into the
past. It can be generalized, as follows.

Definition 7.2.1. The stochastic process $\{X_n, n \in \mathbb{Z}\}$ defined by

$$X_n = \sum_{i=1}^{p} \alpha_i X_{n-i} + \epsilon_n \quad \text{for all } n \in \mathbb{Z} = \{\ldots,-1,0,1,\ldots\}, \tag{7.12}$$

where the $\alpha_i$s are constants, $\{\epsilon_n, n \in \mathbb{Z}\}$ is a white noise WN$(0,\sigma^2)$ process, and $\epsilon_n$ is
independent of $X_{n-1}, X_{n-2}, \ldots$, is called an autoregressive process of order $p$ and
is denoted by AR($p$).

Remarks. (i) If we consider the time series from $n = 0$ only, Equation (7.12) is then
valid for $n = p, p+1, \ldots$. Alternatively, we can set $X_{n-i} = 0$ if $n - i < 0$.

(ii) We could add a constant term $\alpha_0$ in (7.12).

To obtain the variance of an autoregressive process of order 1, given in Equation (7.10),
we can also proceed as follows: because (by assumption) $E[X_n] = 0$, we
have that

$$
\begin{aligned}
\mathrm{VAR}[X_n] = E[X_n^2] &= E\left[ (\alpha_1 X_{n-1} + \epsilon_n)^2 \right] \\
&= \alpha_1^2 E[X_{n-1}^2] + 2\alpha_1 E[X_{n-1} \epsilon_n] + E[\epsilon_n^2] \\
&\overset{\text{ind.}}{=} \alpha_1^2 E[X_{n-1}^2] + 2\alpha_1 E[X_{n-1}]\, E[\epsilon_n] + \sigma^2 \\
&= \alpha_1^2 E[X_{n-1}^2] + \sigma^2.
\end{aligned}
$$

Next, if we assume that the time series is weakly stationary, then we can write that
$E[X_n^2] = E[X_{n-1}^2]$, which implies that

$$E[X_n^2] = \alpha_1^2 E[X_n^2] + \sigma^2 \quad \Longrightarrow \quad \mathrm{VAR}[X_n] = \frac{\sigma^2}{1 - \alpha_1^2} \quad \text{for all } n \in \mathbb{Z}.$$

Remark. We see that the condition $|\alpha_1| < 1$ must be fulfilled, otherwise the variance of
$X_n$ would be negative (if $|\alpha_1| > 1$) or undefined (if $|\alpha_1| = 1$).

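Likewise,

$$
\begin{aligned}
\mathrm{COV}[X_n, X_{n-k}] = E[X_n X_{n-k}] &= E[(\alpha_1 X_{n-1} + \epsilon_n) X_{n-k}] \\
&= \alpha_1 E[X_{n-1} X_{n-k}] + E[\epsilon_n X_{n-k}] \\
&\overset{\text{ind.}}{=} \alpha_1 \mathrm{COV}[X_{n-1}, X_{n-k}] \quad \text{for } k = 1,2,\ldots.
\end{aligned}
$$

Hence, assuming again that the time series is weakly stationary, we obtain that

$$\gamma(k) = \alpha_1 \gamma(k-1) = \alpha_1^2 \gamma(k-2) = \cdots = \alpha_1^k \gamma(0) \quad \Longrightarrow \quad \gamma(k) = \alpha_1^k\, \frac{\sigma^2}{1 - \alpha_1^2} \quad \text{for all } k \in \{0,1,\ldots\},$$

which agrees with Equation (7.11) [because $\gamma(-k) = \gamma(k)$].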

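Now, for a weakly stationary AR($p$) process (with $E[X_n] = 0$), we have that

$$
\begin{aligned}
\mathrm{COV}[X_n, X_{n-k}] = E[X_n X_{n-k}] &= E\left[ \left( \sum_{i=1}^{p} \alpha_i X_{n-i} + \epsilon_n \right) X_{n-k} \right] \\
&= \sum_{i=1}^{p} \alpha_i E[X_{n-i} X_{n-k}] + E[\epsilon_n X_{n-k}] \\
&\overset{\text{ind.}}{=} \sum_{i=1}^{p} \alpha_i \mathrm{COV}[X_{n-i}, X_{n-k}] \quad \text{for } k = 1,2,\ldots.
\end{aligned}
$$

The previous equation can be rewritten as follows:

$$\gamma(k) = \sum_{i=1}^{p} \alpha_i \gamma(k-i) \tag{7.13}$$

$$\Longrightarrow \quad \rho(k) = \sum_{i=1}^{p} \alpha_i \rho(k-i) \quad \text{for } k = 1,2,\ldots. \tag{7.14}$$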

It can be shown that the general solution of Equation (7.14), together with $\rho(0) = 1$, is
given by

$$\rho(k) = \sum_{i=1}^{p} c_i \lambda_i^{k} \quad \text{for } k = 0,1,2,\ldots, \tag{7.15}$$

where the $c_i$s are constants and the $\lambda_i$s are the roots of the equation

$$\lambda^p - \sum_{i=1}^{p} \alpha_i \lambda^{p-i} = 0. \tag{7.16}$$

Remark. We assume that the $p$ roots of the above equation are distinct.

To determine the constants $c_i$, for $i = 1,\ldots,p$, we can use the fact that $\rho(0) = 1$
and the various equations obtained by setting, recursively, $k$ equal to $1,2,\ldots,p-1$ in
(7.14) [assuming that the $p$ roots of Equation (7.16) have been found].

Remarks. (i) It can be shown that the condition $|\lambda_i| < 1$ must be fulfilled for $i = 1,\ldots,p$
for the AR($p$) process to be weakly stationary. We then find that

$$\lim_{k \to \infty} \gamma(k) = \lim_{k \to \infty} \rho(k) = 0.$$

(ii) The equations (7.14) are known as the Yule-Walker equations.

(iii) If we define $z = 1/\lambda$ (assuming that $\lambda \ne 0$), then Equation (7.16) becomes

$$z^{-p} - \sum_{i=1}^{p} \alpha_i z^{i-p} = 0 \quad \Longleftrightarrow \quad 1 - \sum_{i=1}^{p} \alpha_i z^{i} = 0. \tag{7.17}$$

The condition for the time series to be weakly stationary is now that $|z|$ must be strictly
greater than 1 for all roots of the equation.

Example 7.2.1. In the case of a weakly stationary AR(1) process, Equation (7.16)
becomes

$$\lambda = \alpha_1,$$

which implies that Equation (7.15) is

$$\rho(k) = c_1 \alpha_1^k \quad \text{for } k = 0,1,2,\ldots.$$

The condition $\rho(0) = 1$ yields at once that $c_1 = 1$. Hence, we can write that

$$\rho(k) = \alpha_1^k \quad \text{for } k = 0,1,2,\ldots$$

[see Equation (7.8)].

Remarks. (i) Notice that, because $|\alpha_1|$ must be strictly smaller than 1, we deduce that the
autocorrelation function of a weakly stationary AR(1) process decreases geometrically
(in absolute value). Furthermore, if $0 < \alpha_1 < 1$, then $\rho(k)$ is always positive, whereas
in the case when $-1 < \alpha_1 < 0$, the function $\rho(k)$ alternates between geometrically
decreasing positive and negative values.

(ii) Substituting $\rho(k) = \alpha_1^k$ into Equation (7.14) with $p = 1$, we obtain that

$$\alpha_1^k = \alpha_1 \alpha_1^{k-1} = \alpha_1^k,$$

so that the equation is indeed satisfied.

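Example 7.2.2. To obtain the autocorrelation function of a weakly stationary AR(2)
process, we must first find the two roots of the equation [see (7.16)]

$$\lambda^2 - \alpha_1 \lambda - \alpha_2 = 0.$$

We have that

$$\lambda = \frac{1}{2} \left( \alpha_1 \pm \sqrt{\alpha_1^2 + 4\alpha_2} \right).$$

Let

$$\lambda_1 = \frac{1}{2} \left( \alpha_1 + \sqrt{\alpha_1^2 + 4\alpha_2} \right) \quad \text{and} \quad \lambda_2 = \frac{1}{2} \left( \alpha_1 - \sqrt{\alpha_1^2 + 4\alpha_2} \right).$$

We can write that

$$\rho(k) = c_1 \lambda_1^k + c_2 \lambda_2^k \quad \text{for } k = 0,1,2,\ldots.$$

Again, $\rho(0) = 1$ yields that $c_1 + c_2 = 1$, so that

$$\rho(k) = \lambda_2^k + c_1 (\lambda_1^k - \lambda_2^k) \quad \text{for } k = 0,1,2,\ldots.$$

Next, making use of Equation (7.14) with $p = 2$ and $k = 1$, we get that

$$\rho(1) = \alpha_1 \rho(0) + \alpha_2 \rho(-1) = \alpha_1 + \alpha_2 \rho(1),$$

which implies that

$$\rho(1) = \frac{\alpha_1}{1 - \alpha_2}.$$

Hence (see above),

$$\rho(1) = \lambda_2 + c_1 (\lambda_1 - \lambda_2) \quad \text{and} \quad \rho(1) = \frac{\alpha_1}{1 - \alpha_2},$$

which yields

$$c_1 = \frac{[\alpha_1/(1 - \alpha_2)] - \lambda_2}{\lambda_1 - \lambda_2} \quad \text{and} \quad c_2 = \frac{\lambda_1 - [\alpha_1/(1 - \alpha_2)]}{\lambda_1 - \lambda_2}.$$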
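
As a numerical check of Example 7.2.2, the autocorrelation function of an AR(2) process can be computed in two ways: recursively from the Yule-Walker equations (7.14), and from the closed form (7.15) with the constants $c_1$ and $c_2$ found above. The following sketch assumes distinct roots, as in the text; the function names are illustrative, and complex arithmetic is used so that the case $\alpha_1^2 + 4\alpha_2 < 0$ is also covered.

```python
import cmath


def ar2_acf_recursive(alpha1, alpha2, max_lag):
    """rho(0..max_lag) from rho(k) = alpha1*rho(k-1) + alpha2*rho(k-2), with rho(1) = alpha1/(1-alpha2)."""
    rho = [1.0, alpha1 / (1.0 - alpha2)]
    for k in range(2, max_lag + 1):
        rho.append(alpha1 * rho[k - 1] + alpha2 * rho[k - 2])
    return rho[: max_lag + 1]


def ar2_acf_closed_form(alpha1, alpha2, max_lag):
    """rho(k) = c1*lambda1^k + c2*lambda2^k, with lambda1, lambda2 the roots of (7.16) for p = 2."""
    disc = cmath.sqrt(alpha1 ** 2 + 4.0 * alpha2)
    lam1 = (alpha1 + disc) / 2.0
    lam2 = (alpha1 - disc) / 2.0
    if lam1 == lam2:
        raise ValueError("the two roots must be distinct")
    rho1 = alpha1 / (1.0 - alpha2)
    c1 = (rho1 - lam2) / (lam1 - lam2)
    c2 = (lam1 - rho1) / (lam1 - lam2)
    # The imaginary parts cancel (up to rounding) when the roots are complex conjugates.
    return [(c1 * lam1 ** k + c2 * lam2 ** k).real for k in range(max_lag + 1)]


if __name__ == "__main__":
    for a1, a2 in ((0.5, 0.3), (1.0, -0.5)):
        print(ar2_acf_recursive(a1, a2, 5))
        print(ar2_acf_closed_form(a1, a2, 5))
```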


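Now, the variance of a weakly stationary AR($p$) process having zero mean can be
obtained as follows:

$$
\begin{aligned}
\mathrm{VAR}[X_n] = E[X_n^2] &= E\left[ \left( \sum_{i=1}^{p} \alpha_i X_{n-i} + \epsilon_n \right) X_n \right] \\
&= \sum_{i=1}^{p} \alpha_i E[X_{n-i} X_n] + E[\epsilon_n X_n] \\
&\overset{\text{ind.}}{=} \sum_{i=1}^{p} \alpha_i \mathrm{COV}[X_{n-i}, X_n] + E[\epsilon_n^2] \\
&= \sum_{i=1}^{p} \alpha_i \gamma(i) + \sigma^2.
\end{aligned}
\tag{7.18}
$$

We can use this equation, together with (7.13), to find an explicit expression for
$\mathrm{VAR}[X_n]$.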

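Example 7.2.3. Consider again a weakly stationary AR(2) process. From Equation (7.18),

$$\mathrm{VAR}[X_n] = \gamma(0) = \alpha_1 \gamma(1) + \alpha_2 \gamma(2) + \sigma^2. \tag{7.19}$$

Moreover, Equation (7.13), with $p = 2$ and $k = 1$, and the fact that $\gamma(-j) = \gamma(j)$ imply
that

$$\gamma(1) = \alpha_1 \gamma(0) + \alpha_2 \gamma(-1) \quad \Longrightarrow \quad \gamma(1) = \frac{\alpha_1}{1 - \alpha_2}\, \gamma(0),$$

whereas the same equation with $p = 2$ and $k = 2$ gives

$$\gamma(2) = \alpha_1 \gamma(1) + \alpha_2 \gamma(0) \quad \Longrightarrow \quad \gamma(2) = \left( \frac{\alpha_1^2}{1 - \alpha_2} + \alpha_2 \right) \gamma(0).$$

Finally, substituting into (7.19), we find that

$$\gamma(0) = \left( \frac{\alpha_1^2}{1 - \alpha_2} + \frac{\alpha_1^2 \alpha_2}{1 - \alpha_2} + \alpha_2^2 \right) \gamma(0) + \sigma^2,$$

so that

$$\gamma(0) = \frac{(1 - \alpha_2)\, \sigma^2}{(1 + \alpha_2) \left[ (1 - \alpha_2)^2 - \alpha_1^2 \right]}.$$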

Remark. If we add the constant $\alpha_0$ in Equation (7.12), then the mean of a weakly
stationary AR($p$) process is given by

$$E[X_n] = \alpha_0 + \sum_{i=1}^{p} \alpha_i E[X_{n-i}] + E[\epsilon_n] = \alpha_0 + \sum_{i=1}^{p} \alpha_i E[X_n].$$

That is,

$$E[X_n] = \frac{\alpha_0}{1 - \sum_{i=1}^{p} \alpha_i}.$$


7.2.2 Moving average processes 

Definition 7.2.2. Let $\{\epsilon_n, n \in \mathbb{Z}\}$ be a white noise WN$(0,\sigma^2)$ process. The time series
$\{X_n, n \in \mathbb{Z}\}$ defined by

$$X_n = \epsilon_n + \sum_{i=1}^{q} \theta_i \epsilon_{n-i} \quad \text{for all } n \in \mathbb{Z}, \tag{7.20}$$

where the $\theta_i$s are constants, is called a moving average process of order $q$ and is
denoted by MA($q$).

Remarks. (i) We could add the constant $\mu$ in the model. However, we prefer to work
with the centered process $\{X_n, n \in \mathbb{Z}\}$.

(ii) Setting $\theta_0 = 1$, Equation (7.20) can be rewritten as follows:

$$X_n = \sum_{i=0}^{q} \theta_i \epsilon_{n-i} \quad \text{for all } n \in \mathbb{Z}.$$

(iii) We deduce from Equation (7.9) that an AR(1) process can be written as an MA($q$)
time series with $q$ tending to infinity and with coefficients $\theta_i$ equal to $\alpha_1^i$, for $i = 0,1,\ldots$.

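The mean and the variance of an MA($q$) time series are easy to calculate. First, we
(indeed) have that

$$E[X_n] = \sum_{i=0}^{q} \theta_i E[\epsilon_{n-i}] = 0 \quad \text{for all } n \in \mathbb{Z}.$$

Next,

$$\mathrm{VAR}[X_n] = E[X_n^2] = E\left[ \left( \sum_{i=0}^{q} \theta_i \epsilon_{n-i} \right)^2 \right].$$

Because $\{\epsilon_n, n \in \mathbb{Z}\}$ is a white noise process, we can write that

$$\mathrm{VAR}[X_n] = \sum_{i=0}^{q} \theta_i^2 E[\epsilon_{n-i}^2] = \sigma^2 \sum_{i=0}^{q} \theta_i^2. \tag{7.21}$$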
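Now, the covariance of the random variables $X_n$ and $X_{n+k}$ is given by

$$\mathrm{COV}[X_n, X_{n+k}] = E\left[ \left( \sum_{i=0}^{q} \theta_i \epsilon_{n-i} \right) \left( \sum_{i=0}^{q} \theta_i \epsilon_{n+k-i} \right) \right].$$

Again, the fact that $\{\epsilon_n, n \in \mathbb{Z}\}$ is a white noise process implies that

$$\mathrm{COV}[X_n, X_{n+k}] = 0 \quad \text{if } k = q+1, q+2, \ldots.$$

When $k = 1,\ldots,q$, we obtain that

$$
\begin{aligned}
\mathrm{COV}[X_n, X_{n+k}] &= E\left[ \left( \sum_{i=0}^{q} \theta_i \epsilon_{n-i} \right) \left( \sum_{j=-k}^{q-k} \theta_{j+k} \epsilon_{n-j} \right) \right] \\
&= \sum_{i=0}^{q-k} E[\theta_i \theta_{i+k} \epsilon_{n-i}^2] = \sum_{i=0}^{q-k} \theta_i \theta_{i+k} \sigma^2.
\end{aligned}
$$

In general, we find that

$$\mathrm{COV}[X_n, X_{n+k}] = \begin{cases} \sigma^2 \sum_{i=0}^{q-|k|} \theta_i \theta_{i+|k|} & \text{if } |k| = 0,1,\ldots,q, \\ 0 & \text{if } |k| = q+1, q+2, \ldots. \end{cases} \tag{7.22}$$

Hence, we can state that the time series $\{X_n, n \in \mathbb{Z}\}$ is weakly stationary. The
autocorrelation function of a moving average process of order $q$ is therefore

$$\rho(k) = \begin{cases} \dfrac{\sum_{i=0}^{q-|k|} \theta_i \theta_{i+|k|}}{\sum_{i=0}^{q} \theta_i^2} & \text{if } |k| = 0,1,\ldots,q, \\ 0 & \text{if } |k| = q+1, q+2, \ldots. \end{cases} \tag{7.23}$$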

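Particular cases. (i) The autocorrelation function of an MA(1) process is given by

$$\rho(k) = \begin{cases} 1 & \text{if } k = 0, \\ \dfrac{\theta_1}{1 + \theta_1^2} & \text{if } |k| = 1, \\ 0 & \text{if } |k| = 2,3,\ldots. \end{cases} \tag{7.24}$$

(ii) If $q = 2$, we obtain that

$$\rho(k) = \begin{cases} 1 & \text{if } k = 0, \\ \dfrac{\theta_1 (1 + \theta_2)}{1 + \theta_1^2 + \theta_2^2} & \text{if } |k| = 1, \\ \dfrac{\theta_2}{1 + \theta_1^2 + \theta_2^2} & \text{if } |k| = 2, \\ 0 & \text{if } |k| = 3,4,\ldots. \end{cases} \tag{7.25}$$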
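
The general formula for the autocorrelation function of an MA($q$) process is straightforward to evaluate. The short sketch below (the function name `ma_acf` is an illustrative choice) takes the coefficients $\theta_1, \ldots, \theta_q$, prepends $\theta_0 = 1$, and returns $\rho(k)$; it reproduces the two particular cases above.

```python
def ma_acf(thetas, k):
    """Autocorrelation rho(k) of an MA(q) process with coefficients theta_1..theta_q."""
    coeffs = [1.0] + list(thetas)  # theta_0 = 1
    q = len(coeffs) - 1
    k = abs(k)                     # rho(k) = rho(-k)
    if k > q:
        return 0.0
    numerator = sum(coeffs[i] * coeffs[i + k] for i in range(q - k + 1))
    denominator = sum(c * c for c in coeffs)
    return numerator / denominator


if __name__ == "__main__":
    print([ma_acf([0.5], k) for k in range(4)])        # MA(1) with theta_1 = 0.5
    print([ma_acf([0.5, 0.25], k) for k in range(4)])  # MA(2) with theta_1 = 0.5, theta_2 = 0.25
```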


Remarks. (i) If we replace $\theta_1$ by $1/\theta_1$ in $\rho(1)$ for an MA(1) time series, we obtain
the same formula:

$$\rho(1) = \frac{\theta_1^{-1}}{1 + \theta_1^{-2}} = \frac{\theta_1}{\theta_1^2 + 1}. \tag{7.26}$$

Furthermore, let

$$g(x) = \frac{x}{x^2 + 1}.$$

We have that

$$g'(x) = \frac{1 - x^2}{(x^2 + 1)^2} = 0 \quad \text{if and only if } x = \pm 1.$$

Because

$$g''(x) = \frac{2x(x^2 - 3)}{(x^2 + 1)^3},$$

we deduce that $x = -1$ (resp., $x = 1$) corresponds to a minimum (resp., maximum) of
the function $g(x)$. Hence, we may conclude that

$$-\frac{1}{2} \le \rho(1) \le \frac{1}{2}.$$

Conversely, solving for $\theta_1$ in Equation (7.26), we find that

$$\theta_1 = \frac{1}{2\rho(1)} \pm \left[ \frac{1}{4\rho^2(1)} - 1 \right]^{1/2}.$$

If we denote the two roots by $\theta_{1+}$ and $\theta_{1-}$, then we have that

$$\theta_{1+} = \frac{1}{\theta_{1-}}.$$

(ii) In the case of an MA($\infty$) process, it can be shown that

$$\gamma(k) = \sigma^2 \sum_{i=0}^{\infty} \theta_i \theta_{i+|k|},$$

provided that the coefficients $\theta_i$ are absolutely summable. That is, if

$$\sum_{i=0}^{\infty} |\theta_i| < \infty.$$

The reason why time series can be used to model various processes is given in the
following proposition. It is based on a result, known as Wold's decomposition theorem,
which asserts that every weakly stationary (discrete-time) stochastic process can be
written as the sum of a deterministic process and a moving average process (of infinite
order).

Proposition 7.2.1. Any weakly stationary stochastic process $\{X_n, n \in \mathbb{Z}\}$ can be
represented as follows:

$$X_n = \sum_{i=1}^{\infty} c_i \epsilon_{n-i} + \nu_n,$$

where $\{\epsilon_n, n \in \mathbb{Z}\}$ is a white noise WN$(0,\sigma^2)$ process, $\{\nu_n, n \in \mathbb{Z}\}$ is a linearly
deterministic process, and

$$\sum_{i=1}^{\infty} c_i^2 < \infty.$$

Remarks. (i) If $\nu_n = 0$, then the stochastic process $\{X_n, n \in \mathbb{Z}\}$ is said to be purely
nondeterministic, which means that all linear deterministic components have been
subtracted from the process. Notice that any purely nondeterministic process can be
expressed as a moving average process of infinite order.

(ii) A process is linearly deterministic if it is perfectly predictable from its own past.

(iii) The proposition implies that any weakly stationary process has a linear structure.
That is, it is a linear combination of uncorrelated random variables.

Definition 7.2.3. The lag or backshift operator is denoted by $L$ and is defined by

$$L X_n = X_{n-1} \quad \text{for all } n.$$

In general, we have that

$$L^k X_n = X_{n-k} \quad \text{for all } n \text{ and for } k = 0,1,\ldots.$$

Furthermore, the (first) difference operator, $\Delta$, is given by

$$\Delta = 1 - L,$$

so that

$$\Delta X_n = X_n - X_{n-1} \quad \text{for all } n.$$

Remarks. (i) The operators $L$ and $\Delta$ can be manipulated as if they were algebraic
quantities. Hence,

$$\Delta^2 X_n = (1 - L)^2 X_n = (1 - 2L + L^2) X_n = X_n - 2X_{n-1} + X_{n-2}.$$

(ii) If we apply the operators $L$ and $\Delta$ to a constant $\mu$, we obtain that

$$L\mu = \mu \quad \text{and} \quad \Delta\mu = 0.$$
Now, using the lag operator, we can express an AR($p$) process as follows:

$$\left( 1 - \sum_{i=1}^{p} \alpha_i L^i \right) X_n = \epsilon_n \quad \text{for all } n \in \mathbb{Z}.$$

The time series is weakly stationary if [see Equation (7.17)] all the roots of the
polynomial equation

$$1 - \sum_{i=1}^{p} \alpha_i L^i = 0 \tag{7.27}$$

lie outside the unit circle.

Next, an MA($q$) time series can also be expressed by means of the lag operator:

$$X_n = \left( 1 + \sum_{i=1}^{q} \theta_i L^i \right) \epsilon_n \quad \text{for all } n \in \mathbb{Z}.$$

Definition 7.2.4. We say that an MA($q$) process is invertible if all the roots of the
polynomial equation

$$1 + \sum_{i=1}^{q} \theta_i L^i = 0 \tag{7.28}$$

lie outside the unit circle.


Particular cases. (i) If $q = 1$, we simply have that

$$1 + \theta_1 L = 0 \quad \Longrightarrow \quad L = -\frac{1}{\theta_1}.$$

The root $L$ must be such that $|L| > 1$. That is,

$$|\theta_1| < 1.$$

(ii) In the case of an MA(2) process, the two roots of

$$1 + \theta_1 L + \theta_2 L^2 = 0$$

are

$$L_+ = \frac{1}{2\theta_2} \left[ -\theta_1 + (\theta_1^2 - 4\theta_2)^{1/2} \right] \quad \text{and} \quad L_- = \frac{1}{2\theta_2} \left[ -\theta_1 - (\theta_1^2 - 4\theta_2)^{1/2} \right].$$

We know that a complex number $z = x + iy$ lies outside the unit circle if $x^2 + y^2 > 1$.
Here, we find that $L_+$ and $L_-$ both lie outside the unit circle if and only if

$$\theta_1 + \theta_2 > -1, \quad \theta_2 - \theta_1 > -1 \quad \text{and} \quad -1 < \theta_2 < 1.$$
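
The invertibility of an MA(2) process can be tested either directly, by computing the two roots of $1 + \theta_1 L + \theta_2 L^2 = 0$, or through inequalities on the coefficients. The sketch below uses the standard form of these inequalities, $\theta_1 + \theta_2 > -1$, $\theta_2 - \theta_1 > -1$ and $-1 < \theta_2 < 1$; the function names are illustrative assumptions, and $\theta_2 \ne 0$ is required so that the equation is genuinely quadratic.

```python
import cmath


def ma2_roots(theta1, theta2):
    """The two (possibly complex) roots of 1 + theta1*L + theta2*L**2 = 0, for theta2 != 0."""
    if theta2 == 0:
        raise ValueError("theta2 must be nonzero for an MA(2) process")
    disc = cmath.sqrt(theta1 ** 2 - 4.0 * theta2)
    return ((-theta1 + disc) / (2.0 * theta2), (-theta1 - disc) / (2.0 * theta2))


def ma2_invertible_by_roots(theta1, theta2):
    """Invertible if and only if both roots lie strictly outside the unit circle."""
    return all(abs(root) > 1.0 for root in ma2_roots(theta1, theta2))


def ma2_invertible_by_inequalities(theta1, theta2):
    """Same test, written as inequalities on the coefficients (standard form, see lead-in)."""
    return theta1 + theta2 > -1 and theta2 - theta1 > -1 and -1 < theta2 < 1


if __name__ == "__main__":
    for t1, t2 in ((0.5, 0.3), (1.5, 0.6), (1.8, 0.5)):
        print(t1, t2, ma2_invertible_by_roots(t1, t2), ma2_invertible_by_inequalities(t1, t2))
```

For instance, with $\theta_1 = 1.8$ and $\theta_2 = 0.5$ the roots are real and one of them has modulus smaller than 1, so the process is not invertible; this is detected by the inequality $\theta_2 - \theta_1 > -1$ failing.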


7.2.3 Autoregressive moving average processes 

To conclude this section, we define autoregressive moving average processes which, as 
their name indicates, generalize both autoregressive and moving average processes. 

Definition 7.2.5. The (zero mean) stochastic process $\{X_n, n \in \mathbb{Z}\}$ defined by

$$X_n = \sum_{i=1}^{p} \alpha_i X_{n-i} + \epsilon_n + \sum_{j=1}^{q} \theta_j \epsilon_{n-j} \quad \text{for all } n \in \mathbb{Z},$$

where $\{\epsilon_n, n \in \mathbb{Z}\}$ is a white noise WN$(0,\sigma^2)$ process (with $\epsilon_n$ independent of
$X_{n-1}, X_{n-2}, \ldots$), is called an autoregressive moving average process of order $(p,q)$ and
is denoted by ARMA($p,q$).

Remarks. (i) An ARMA($p,q$) process may be weakly stationary and invertible. It is
weakly stationary if all the roots of Equation (7.27) lie outside the unit circle. That
is, if the autoregressive part of the process is weakly stationary. Likewise, if the MA($q$)
part of the time series is invertible, then so is the ARMA($p,q$) process.

(ii) A weakly stationary and invertible ARMA($p,q$) time series can be expressed as an
infinite autoregressive process or, equivalently, as a moving average process of infinite
order.

(iii) For an ARMA process to have a minimal number of terms, Equations (7.27) and
(7.28) should have no common roots.

Example 7.2.4. Suppose that $\{X_n, n \in \mathbb{Z}\}$ can be modeled as follows:

$$X_n = \alpha_1 X_{n-1} \quad \text{for all } n \in \mathbb{Z},$$

but cannot be observed directly. Rather, we can observe the random variable $Y_n$ defined
by

$$Y_n = X_n + \epsilon_n \quad \text{for } n \in \mathbb{Z},$$

where $\{\epsilon_n, n \in \mathbb{Z}\}$ is a WN$(0,\sigma^2)$ process (and $\epsilon_n$ is independent of $Y_{n-1}, Y_{n-2}, \ldots$).
Let us define

$$Z_n = Y_n - \alpha_1 Y_{n-1} \quad \text{for } n \in \mathbb{Z}. \tag{7.29}$$

We have that

$$Z_n = X_n + \epsilon_n - \alpha_1 (X_{n-1} + \epsilon_{n-1}).$$

That is,

$$Z_n = X_n - \alpha_1 X_{n-1} + \epsilon_n - \alpha_1 \epsilon_{n-1} = \epsilon_n - \alpha_1 \epsilon_{n-1} \quad \text{for } n \in \mathbb{Z}.$$

We may assert that $\{Z_n, n \in \mathbb{Z}\}$ is an MA(1) process with $\theta_1 = -\alpha_1$. Therefore, we
deduce from (7.29) that $\{Y_n, n \in \mathbb{Z}\}$ is actually a particular ARMA(1,1) time series.

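We now calculate the autocovariance function of a weakly stationary ARMA(1,1)
process, which is defined by

$$X_n = \alpha_1 X_{n-1} + \epsilon_n + \theta_1 \epsilon_{n-1} \quad \text{for all } n \in \mathbb{Z}.$$

We have that [see (7.18)]

$$
\begin{aligned}
\gamma(0) = \mathrm{VAR}[X_n] &= E[(\alpha_1 X_{n-1} + \epsilon_n + \theta_1 \epsilon_{n-1}) X_n] \\
&= \alpha_1 E[X_{n-1} X_n] + E[\epsilon_n X_n] + \theta_1 E[\epsilon_{n-1} X_n] \\
&= \alpha_1 \gamma(1) + E[\epsilon_n^2] + \theta_1 \left( E[\epsilon_{n-1} \alpha_1 X_{n-1}] + E[\theta_1 \epsilon_{n-1}^2] \right) \\
&= \alpha_1 \gamma(1) + \sigma^2 + \theta_1 \left( \alpha_1 E[\epsilon_{n-1}^2] + \theta_1 \sigma^2 \right) \\
&= \alpha_1 \gamma(1) + \sigma^2 (1 + \theta_1 \alpha_1 + \theta_1^2).
\end{aligned}
\tag{7.30}
$$

Likewise,

$$
\begin{aligned}
\gamma(1) = \mathrm{COV}[X_n, X_{n-1}] &= E[(\alpha_1 X_{n-1} + \epsilon_n + \theta_1 \epsilon_{n-1}) X_{n-1}] \\
&= \alpha_1 E[X_{n-1}^2] + E[\epsilon_n X_{n-1}] + \theta_1 E[\epsilon_{n-1} X_{n-1}] \\
&= \alpha_1 \gamma(0) + \theta_1 E[\epsilon_{n-1}^2] \\
&= \alpha_1 \gamma(0) + \theta_1 \sigma^2
\end{aligned}
\tag{7.31}
$$

and

$$
\begin{aligned}
\gamma(k) = \mathrm{COV}[X_n, X_{n-k}] &= E[(\alpha_1 X_{n-1} + \epsilon_n + \theta_1 \epsilon_{n-1}) X_{n-k}] \\
&= \alpha_1 E[X_{n-1} X_{n-k}] + E[\epsilon_n X_{n-k}] + \theta_1 E[\epsilon_{n-1} X_{n-k}] \\
&\overset{\text{ind.}}{=} \alpha_1 \gamma(k-1) \quad \text{for } k = 2,3,\ldots.
\end{aligned}
\tag{7.32}
$$

Notice that Equations (7.30) and (7.31) enable us to write that

$$\gamma(0) = \frac{\sigma^2 (1 + 2\theta_1 \alpha_1 + \theta_1^2)}{1 - \alpha_1^2} \quad \text{and} \quad \gamma(1) = \frac{\sigma^2 [\theta_1 + \alpha_1 (1 + \theta_1 \alpha_1 + \theta_1^2)]}{1 - \alpha_1^2}.$$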

Finally, remember that we are interested in (weakly) stationary time series. If the
time series $\{X_n, n \in \mathbb{Z}\}$ is not weakly stationary, we can consider the differenced process
$\{Y_n, n \in \mathbb{Z}\}$ defined by

$$Y_n = \Delta X_n = X_n - X_{n-1} \quad \text{for all } n \in \mathbb{Z}.$$

If the resulting process is weakly stationary, then we can try to model it as an
ARMA($p,q$) process. If it is not, we can difference the $\{Y_n, n \in \mathbb{Z}\}$ process, and so
on.

In general, we have the following definition.

Definition 7.2.6. Suppose that the stochastic process $\{Y_n, n \in \mathbb{Z}\}$ defined by

$$Y_n = \Delta^d X_n = X_n + \sum_{j=1}^{d} (-1)^j \binom{d}{j} X_{n-j} \quad \text{for all } n \in \mathbb{Z},$$

where $d = 1,2,\ldots$, is an ARMA($p,q$) process. Then, $\{X_n, n \in \mathbb{Z}\}$ is called an
autoregressive integrated moving average process and is denoted by ARIMA($p,d,q$).
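
The differencing formula of Definition 7.2.6 can be applied directly to a finite list of observations. The following sketch (the function name `difference` is an illustrative choice) computes $\Delta^d$ with the binomial coefficients; for a finite sample the result is only defined for indices $n \ge d$, so the output is $d$ terms shorter than the input.

```python
from math import comb


def difference(x, d):
    """Apply the d-th difference: y_n = sum_{j=0}^{d} (-1)^j * C(d, j) * x_{n-j}, for n = d, ..., len(x)-1."""
    if d < 0:
        raise ValueError("d must be a nonnegative integer")
    return [
        sum((-1) ** j * comb(d, j) * x[n - j] for j in range(d + 1))
        for n in range(d, len(x))
    ]


if __name__ == "__main__":
    squares = [1, 4, 9, 16, 25]
    print(difference(squares, 1))  # first differences
    print(difference(squares, 2))  # second differences of a quadratic trend are constant
```

Differencing a quadratic trend twice leaves a constant sequence, which illustrates why repeated differencing can remove a polynomial trend before an ARMA model is fitted.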


7.3 Modeling and forecasting 

Let $X_n$ be the (random) closing value of a certain stock market index, for example,
the Dow Jones index, at time $n$. Suppose that we gathered observations $x_1, x_2, \ldots, x_N$
of this closing value over a given period of time and that we would like to model
the random process $\{X_n, n = 0,1,2,\ldots\}$ as a particular time series. The first step is
to check that the data collected constitute a weakly stationary time series. If it is not
the case, we can difference the data until the assumption of stationarity is reasonable.
In practice, one way to determine that a particular time series must be differenced is
to look at the graph of the successive $x_n$s in time. There should be no obvious trend
pattern. That is, the data should move randomly around a constant mean over time.
We assume, in the sequel, that the time series considered is indeed weakly stationary.

To try to identify the orders $p$ and $q$ of an autoregressive moving average process
that we would like to use as a model for a given dataset, we can look at the sample
autocorrelation function. We have seen in the preceding section, p. 242, that the
(theoretical) autocorrelation function $\rho(k)$ of a weakly stationary AR($p$) process decreases
geometrically (in absolute value) with $|k|$, and $\rho(k)$ is equal to 0 for $|k| > q$ in the case
of an MA($q$) time series [see Equation (7.23)].

To help us determine (approximately) the values of $p$ and $q$, we define the partial
autocorrelation function.

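Definition 7.3.1. Let $\rho(k)$ be the autocorrelation function of a weakly stationary time
series $\{X_n, n \in \mathbb{Z}\}$. Define the determinants

$$D_{k,1} = \begin{vmatrix} 1 & \rho(1) & \rho(2) & \cdots & \rho(k-2) & \rho(1) \\ \rho(1) & 1 & \rho(1) & \cdots & \rho(k-3) & \rho(2) \\ \rho(2) & \rho(1) & 1 & \cdots & \rho(k-4) & \rho(3) \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ \rho(k-1) & \rho(k-2) & \rho(k-3) & \cdots & \rho(1) & \rho(k) \end{vmatrix}$$

and

$$D_{k,2} = \begin{vmatrix} 1 & \rho(1) & \rho(2) & \cdots & \rho(k-2) & \rho(k-1) \\ \rho(1) & 1 & \rho(1) & \cdots & \rho(k-3) & \rho(k-2) \\ \rho(2) & \rho(1) & 1 & \cdots & \rho(k-4) & \rho(k-3) \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ \rho(k-1) & \rho(k-2) & \rho(k-3) & \cdots & \rho(1) & 1 \end{vmatrix}.$$

The partial autocorrelation function, denoted by $\phi(k)$, of the process is given by

$$\phi(k) = \frac{D_{k,1}}{D_{k,2}} \quad \text{for } k = 1,2,\ldots.$$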

Remarks. (i) The determinants in the above definition are given by (for $k = 1$)

$$D_{1,1} = |\rho(1)| = \rho(1) \quad \text{and} \quad D_{1,2} = |1| = 1$$

and (for $k = 2$)

$$D_{2,1} = \begin{vmatrix} 1 & \rho(1) \\ \rho(1) & \rho(2) \end{vmatrix} = \rho(2) - \rho^2(1)
\quad \text{and} \quad
D_{2,2} = \begin{vmatrix} 1 & \rho(1) \\ \rho(1) & 1 \end{vmatrix} = 1 - \rho^2(1).$$

That is, first we write the last column of the matrix and then we add the other $k - 1$
columns, starting from the first one on the left and moving right (see solved exercise
no. 8 for $D_{3,1}$ and $D_{3,2}$).

(ii) The $\phi(k)$s are actually obtained by solving the system of linear equations

$$\rho(j) = \phi(1)\rho(1-j) + \phi(2)\rho(2-j) + \phi(3)\rho(3-j) + \cdots + \phi(k)\rho(k-j)$$

for $j = 1, \ldots, k$ [and using the fact that $\rho(-k) = \rho(k)$]. These equations follow from the
Yule-Walker equations (7.14).

(iii) In the case of a weakly stationary AR($p$) time series, $\phi(k)$ is the autocorrelation at
lag $k$, after having removed the autocorrelation with an AR($k-1$) process. That is, it
measures the correlation that is not explained by an AR($k-1$) process.

(iv) If the time series is a weakly stationary AR($p$) model, we find that $\phi(k) = 0$ for
$k = p+1, p+2, \ldots$, whereas $\phi(k)$ decreases (approximately) geometrically, in absolute
value, in the case of an MA($q$) [and an ARMA($p,q$)] model (see Exercise no. 18, p. 265).
Notice that the $\phi(k)$ function behaves in the exact opposite way as $\rho(k)$ does for AR($p$)
and MA($q$) processes.

(v) We deduce from the preceding remark that if we want to use an AR($p$) model for a
given dataset, then the sample partial autocorrelations should be approximately equal
to 0 from $k = p+1$. In practice, the autocorrelations $\rho(k)$ must first be estimated from
the data before we can compute the sample $\phi(k)$s.
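
The determinant ratio of Definition 7.3.1 can be evaluated numerically. The sketch below is only an illustration under stated assumptions: `det` and `pacf` are hypothetical helper names, the determinant is computed by plain Gaussian elimination, and the two autocorrelation functions (an AR(1) with coefficient 0.5 and an MA(1) with $\theta_1 = 0.5$) are chosen for the example.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0.0                      # numerically singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result                # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def pacf(rho, k):
    """phi(k) = D_{k,1} / D_{k,2}; rho is a function with rho(0) = 1."""
    d2 = [[rho(abs(i - j)) for j in range(k)] for i in range(k)]
    d1 = [row[:] for row in d2]
    for i in range(k):
        d1[i][k - 1] = rho(i + 1)           # last column: rho(1), ..., rho(k)
    return det(d1) / det(d2)


def rho_ar1(k):
    """Autocorrelation of an AR(1) process with coefficient 0.5 (assumed)."""
    return 0.5 ** abs(k)


def rho_ma1(k):
    """Autocorrelation of an MA(1) process with theta_1 = 0.5 (assumed)."""
    theta = 0.5
    return {0: 1.0, 1: theta / (1 + theta ** 2)}.get(abs(k), 0.0)


for k in (1, 2, 3):
    print(k, pacf(rho_ar1, k), pacf(rho_ma1, k))
```

For the AR(1) autocorrelations the partial autocorrelation vanishes from lag 2 onward, while for the MA(1) case it alternates in sign and shrinks with $k$, in line with Remark (iv).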

Particular cases. (i) If $\{X_n, n \in \mathbb{Z}\}$ is a weakly stationary AR(1) stochastic process,
then we only have to calculate

$$\phi(1) = \frac{D_{1,1}}{D_{1,2}} = \rho(1) \qquad (7.33)$$

and we can set $\phi(k) = 0$ for $k = 2, 3, \ldots$.

Remark. Notice that $\phi(1)$ will always be equal to $\rho(1)$.

(ii) When the time series considered is a weakly stationary AR(2) process, we have that

$$\phi(1) = \frac{D_{1,1}}{D_{1,2}} = \rho(1) \quad \text{and} \quad \phi(2) = \frac{D_{2,1}}{D_{2,2}} = \frac{\rho(2) - \rho^2(1)}{1 - \rho^2(1)} \qquad (7.34)$$

[and $\phi(k) = 0$ for $k = 3, 4, \ldots$].

(iii) In the case of an invertible MA(1) time series, it can be shown that

$$\phi(k) = -\frac{(-\theta_1)^k}{1 + \theta_1^2 + \cdots + \theta_1^{2k}} = -\frac{(-\theta_1)^k (1 - \theta_1^2)}{1 - \theta_1^{2(k+1)}} \quad \text{for } k = 1, 2, \ldots. \qquad (7.35)$$



Once a particular ARMA($p,q$) process has been identified as being reasonable for
the data collected, we must estimate the parameters $\alpha_1, \ldots, \alpha_p$ and $\theta_1, \ldots, \theta_q$ in the
model. Finally, there exist many statistical tests that enable us to assess the quality of
the fit of the model to the data. We do not get into these statistical questions in this
book. However, we give the formula used to estimate the autocorrelation function $\rho(k)$.

The original time series $\{Y_n, n \in \mathbb{Z}\}$ is assumed to have a constant mean $\mu$. To
estimate $\mu$, we collect some data, $y_1, y_2, \ldots, y_N$, and we calculate their arithmetic
mean. That is, we set

$$\hat{\mu} = \frac{1}{N} \sum_{n=1}^{N} y_n.$$

Remark. The observed data $y_1, y_2, \ldots, y_N$ are particular values taken by random
variables that constitute a random sample of the variable of interest, say $Y$. The quantity
$\hat{\mu}$ defined above is a point estimate of $E[Y_n] = \mu$.

Next, we set

$$x_n = y_n - \hat{\mu} \quad \text{for } n = 1, 2, \ldots, N.$$

The corresponding process $\{X_n, n \in \mathbb{Z}\}$ is supposed to have zero mean and be weakly
stationary. The point estimate of the (constant) variance of the $X_n$s is

$$\hat{\sigma}_X^2 = \frac{1}{N} \sum_{n=1}^{N} x_n^2.$$

Remark. Remember that the variance $\sigma_X^2$ of the $X_n$s is not the same as the variance $\sigma^2$
of the white noise process.

More generally, to estimate the covariance of $X_n$ and $X_{n-k}$, we use

$$\hat{\gamma}(k) = \frac{1}{N} \sum_{n=k+1}^{N} x_n x_{n-k}.$$

Then, the point estimate of $\rho(k)$ is given by

$$\hat{\rho}(k) = \frac{\hat{\gamma}(k)}{\hat{\gamma}(0)} \quad \text{for } k = 1, 2, \ldots.$$

Example 7.3.1. Suppose that $X_0 = 0$ and that

$$X_n = 0.5 X_{n-1} + \epsilon_n \quad \text{for } n = 1, 2, \ldots,$$

where the $\epsilon_n$s are i.i.d. random variables having a U($-1,1$) distribution (and $\epsilon_n$ is
independent of $X_{n-1}, \ldots, X_0$). Therefore, $\{\epsilon_n, n = 1, 2, \ldots\}$ is an IID(0, 1/3) noise and
$\{X_n, n = 0, 1, \ldots\}$ is an AR(1) process.

Using a statistical software package, we generated 40 (independent) observations of
a U($-1,1$) random variable and we calculated the value $x_n$ of $X_n$, for $n = 1, \ldots, 40$.
The results are the following:


n        1      2      3      4      5      6      7      8      9     10
x_n  -0.41   0.70   0.76   0.38   0.60  -0.20   0.22  -0.15  -0.39  -0.76

n       11     12     13     14     15     16     17     18     19     20
x_n   0.59  -0.40  -0.05   0.39   0.42   0.38   0.47  -0.04  -0.23  -0.11

n       21     22     23     24     25     26     27     28     29     30
x_n  -0.84  -0.49  -1.23  -0.80  -0.92  -0.61   0.47   0.76  -0.44  -0.90

n       31     32     33     34     35     36     37     38     39     40
x_n   0.54  -0.12  -0.76   0.32  -0.36   0.60   0.75   1.00   0.01  -0.73


Although the number of data points is not very large, we now check whether we
are able to determine that the $x_n$s are observations of an AR(1) process by proceeding
as suggested above. First, if we look at the scatter diagram of the $x_n$s against $n$, we
notice no obvious trend in the data (see Figure 7.2). Therefore, we may assume that
the underlying stochastic process is weakly stationary.

Next, the mean of the data points is equal to $-0.0395$. Moreover, the point estimate
of the variance $\sigma_X^2$ is approximately 0.337.

Remark. The number of observations is not large enough to obtain very accurate point
estimates of the mean (which is actually equal to 0) and the variance of the $X_n$s, which
is [see (7.10)]

$$\mathrm{VAR}[X_n] = \frac{\sigma^2}{1 - \alpha_1^2} = \frac{1/3}{1 - (0.5)^2} = \frac{4}{9} \simeq 0.444.$$

Fig. 7.2. Scatter diagram of the data in Example 7.3.1.

Finally, we calculate the point estimate of $\phi(k)$, for $k = 1, 2, 3$. We find that

$$\hat{\phi}(1) \simeq 0.328, \quad \hat{\phi}(2) \simeq 0.010 \quad \text{and} \quad \hat{\phi}(3) \simeq 0.057.$$

We notice that the partial autocorrelations are close to 0 for $k = 2, 3$, from which we
deduce that an AR(1) model could indeed be reasonable for the data. In fact, some
values of $\hat{\phi}(k)$ are larger (in absolute value) as $k$ increases. However, the larger $k$ is, the
fewer data points are available to estimate the corresponding partial autocorrelation, so
that the point estimates are less and less reliable, especially when $n$ is small.
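
The point estimates quoted in Example 7.3.1 can be recomputed from the 40 data points. The following sketch is an illustration only: the names `x_raw`, `sample_autocovariance`, `var_hat`, `rho_hat`, `phi1_hat` and `phi2_hat` are hypothetical, the data are centered by their arithmetic mean, and the estimator divides by $N$. The lag-2 partial autocorrelation uses the ratio $D_{2,1}/D_{2,2}$ with sample autocorrelations plugged in.

```python
# The 40 observations listed in Example 7.3.1.
x_raw = [
    -0.41, 0.70, 0.76, 0.38, 0.60, -0.20, 0.22, -0.15, -0.39, -0.76,
    0.59, -0.40, -0.05, 0.39, 0.42, 0.38, 0.47, -0.04, -0.23, -0.11,
    -0.84, -0.49, -1.23, -0.80, -0.92, -0.61, 0.47, 0.76, -0.44, -0.90,
    0.54, -0.12, -0.76, 0.32, -0.36, 0.60, 0.75, 1.00, 0.01, -0.73,
]


def sample_autocovariance(x, k):
    """(1/N) * sum_{n=k+1}^{N} x_n * x_{n-k} for centered data x."""
    n_obs = len(x)
    return sum(x[n] * x[n - k] for n in range(k, n_obs)) / n_obs


mean = sum(x_raw) / len(x_raw)
x = [value - mean for value in x_raw]          # centered data

var_hat = sample_autocovariance(x, 0)          # estimate of the variance
rho_hat = [sample_autocovariance(x, k) / var_hat for k in range(3)]

phi1_hat = rho_hat[1]                          # phi(1) always equals rho(1)
phi2_hat = (rho_hat[2] - rho_hat[1] ** 2) / (1 - rho_hat[1] ** 2)

print(round(mean, 4), round(var_hat, 3), round(phi1_hat, 3), round(phi2_hat, 3))
```

The same two-line estimator extends to larger lags, although, as noted in the example, fewer products enter the sum as $k$ grows.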

We now turn to the problem of using the data to forecast future values of the time 
series. 

Let $\{X_n, n \in \mathbb{Z}\}$ be a time series. Suppose that we would like to forecast the value of
$X_{n+j}$, where $j \in \{1, 2, \ldots\}$, based on the observed random variables $X_n, X_{n-1}, \ldots$. Let
$H_n$ denote the set $\{X_n, X_{n-1}, \ldots\}$. That is, $H_n$ is the history of the stochastic process
up to time $n$.

Next, denote by $g(X_{n+j} \mid H_n)$ the predictor of $X_{n+j}$, given $H_n$. To determine the
best predictor, we need a criterion. One which is widely used is the following: we look
for the function $g$ that minimizes the mean-square error

$$\mathrm{MSE}(g) := E[\{X_{n+j} - g(X_{n+j} \mid H_n)\}^2].$$

It can be shown that (if the mathematical expectation exists) the optimal predictor is
actually the conditional expectation of $X_{n+j}$, given $H_n$:

$$g^*(X_{n+j} \mid H_n) = E[X_{n+j} \mid H_n].$$

In practice, we cannot use an infinite number of random variables to forecast $X_{n+j}$.
Therefore, the set $H_n$ could, for example, be $\{X_n, X_{n-1}, \ldots, X_1\}$. Moreover, sometimes
we look for a function $g(X_{n+j} \mid H_n)$ of the form

$$g(X_{n+j} \mid H_n) = \xi_0 + \sum_{i=1}^{n} \xi_i X_{n-i+1}. \qquad (7.36)$$

The function $g(X_{n+j} \mid H_n)$ having coefficients $\xi_i$, for $i = 0, \ldots, n$, which yield the
smallest mean-square error is called the best linear predictor of $X_{n+j}$.

Now, we consider the case when we want to forecast $X_{n+j}$ based on $X_n$ alone.
When the stochastic process $\{X_n, n \in \mathbb{Z}\}$ is Markovian, and $X_n$ is known, the values of
$X_{n-1}, X_{n-2}, \ldots$ are actually unnecessary to predict $X_{n+j}$.

Proposition 7.3.1. Let $\{X_n, n \in \mathbb{Z}\}$ be a Markovian process. The optimal predictor of
$X_{n+j}$, based on the history of the process up to time $n$, is a function of $X_n$ alone.

Furthermore, if $\{X_n, n \in \mathbb{Z}\}$ is a stationary Gaussian process, then the optimal
predictor and the best linear predictor of $X_{n+j}$, based on $X_n$, coincide, which follows
from the next proposition.

Proposition 7.3.2. Let $\{X_n, n \in \mathbb{Z}\}$ be a stationary Gaussian process. The expected
value of $X_{n+j}$, given that $X_n = x_n$, is of the form

$$E[X_{n+j} \mid X_n = x_n] = a x_n + b, \qquad (7.37)$$

where the constants $a$ and $b$ are given by

$$a = \rho(j) \quad \text{and} \quad b = \mu[1 - \rho(j)]. \qquad (7.38)$$

Remarks. (i) In general, if the random vector $(X_1, X_2)$ has a bivariate normal distribution,
then $E[X_2 \mid X_1 = x_1]$ is given in (7.3).

(ii) The above result can be generalized to the case when we calculate the conditional
expectation $E[X_{n+j} \mid X_n, X_{n-1}, \ldots, X_1]$.

(iii) Remember (see Proposition 7.1.1) that, in the case of a Gaussian process, if it is
weakly stationary, then it is also strictly stationary.

(iv) We deduce from (7.3) that the conditional variance $\mathrm{VAR}[X_{n+j} \mid X_n = x_n]$ is

$$\mathrm{VAR}[X_{n+j} \mid X_n = x_n] = \sigma_X^2 [1 - \rho^2(j)], \qquad (7.39)$$

where $\sigma_X^2 = \mathrm{VAR}[X_n]$ for all $n$. Hence, the closer to one (in absolute value) the correlation
coefficient $\rho(j)$ is, the more accurate is the forecast of $X_{n+j}$ (based on $X_n$), which
is logical.

Next, suppose that we want to forecast the value, $X_{n+1}$, of a weakly stationary and
invertible ARMA($p,q$) process at time $n+1$, given $X_n, X_{n-1}, \ldots$. We have that

$$X_{n+1} = \sum_{i=1}^{p} \alpha_i X_{n+1-i} + \epsilon_{n+1} + \sum_{j=1}^{q} \theta_j \epsilon_{n+1-j}.$$

Hence, we can write that

$$E[X_{n+1} \mid H_n] = \sum_{i=1}^{p} \alpha_i X_{n+1-i} + \sum_{j=1}^{q} \theta_j \epsilon_{n+1-j}.$$

Indeed, given the history of the process up to time $n$, the random variables $X_{n+1-i}$ and
$\epsilon_{n+1-j}$ are known quantities for $i, j \in \{1, 2, \ldots\}$. Moreover, by independence,

$$E[\epsilon_{n+1} \mid H_n] = E[\epsilon_{n+1}] = 0.$$

More generally, to forecast $X_{n+m}$, where $m \in \{1, 2, \ldots\}$, we must calculate the
conditional expectation

$$E[X_{n+m} \mid H_n] = \sum_{i=1}^{p} \alpha_i E[X_{n+m-i} \mid H_n] + \sum_{j=1}^{q} \theta_j E[\epsilon_{n+m-j} \mid H_n],$$

in which

$$E[X_{n+m-i} \mid H_n] = X_{n+m-i} \quad \text{if } m \le i$$

and

$$E[\epsilon_{n+m-j} \mid H_n] = \begin{cases} 0 & \text{if } m > j, \\ \epsilon_{n+m-j} & \text{if } m \le j. \end{cases}$$

Particular case. Let $p = q = 1$, so that

$$X_{n+m} = \alpha_1 X_{n+m-1} + \epsilon_{n+m} + \theta_1 \epsilon_{n+m-1}.$$

Then,

$$E[X_{n+1} \mid H_n] = \alpha_1 X_n + \theta_1 \epsilon_n$$

and

$$E[X_{n+m} \mid H_n] = \alpha_1 E[X_{n+m-1} \mid H_n] \quad \text{for } m = 2, 3, \ldots.$$

It follows that

$$E[X_{n+2} \mid H_n] = \alpha_1 E[X_{n+1} \mid H_n] = \alpha_1 \{\alpha_1 X_n + \theta_1 \epsilon_n\} = \alpha_1^2 X_n + \alpha_1 \theta_1 \epsilon_n$$

and

$$E[X_{n+3} \mid H_n] = \alpha_1 E[X_{n+2} \mid H_n] = \alpha_1 \{\alpha_1^2 X_n + \alpha_1 \theta_1 \epsilon_n\} = \alpha_1^3 X_n + \alpha_1^2 \theta_1 \epsilon_n,$$

and so on. That is, we have that

$$E[X_{n+m} \mid H_n] = \alpha_1^m X_n + \alpha_1^{m-1} \theta_1 \epsilon_n \quad \text{for } m = 2, 3, \ldots.$$


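Example 7.3.2. Suppose that $\{X_n, n \in \mathbb{Z}\}$ is a weakly stationary AR(1) time series.
Therefore,

$$X_{n+m} = \alpha_1 X_{n+m-1} + \epsilon_{n+m},$$

where $0 < |\alpha_1| < 1$. Proceeding as above, we find that

$$E[X_{n+m} \mid H_n] = \alpha_1^m X_n \quad \text{for } m = 1, 2, 3, \ldots.$$

This result can actually be deduced directly from the formula

$$X_{n+m} = \alpha_1 X_{n+m-1} + \epsilon_{n+m} = \alpha_1(\alpha_1 X_{n+m-2} + \epsilon_{n+m-1}) + \epsilon_{n+m}
= \cdots = \alpha_1^m X_n + \sum_{i=0}^{m-1} \alpha_1^i \epsilon_{n+m-i}$$

(because $E[\epsilon_{n+m-i} \mid H_n] = 0$ if $m > i$).

We also deduce from the previous formula that the forecasting error is given by

$$X_{n+m} - E[X_{n+m} \mid H_n] = \alpha_1^m X_n + \sum_{i=0}^{m-1} \alpha_1^i \epsilon_{n+m-i} - \alpha_1^m X_n = \sum_{i=0}^{m-1} \alpha_1^i \epsilon_{n+m-i}.$$

It follows that

$$\mathrm{VAR}\big[X_{n+m} - E[X_{n+m} \mid H_n]\big] \overset{\text{ind.}}{=} \sum_{i=0}^{m-1} \alpha_1^{2i}\, \mathrm{VAR}[\epsilon_{n+m-i}] = \sum_{i=0}^{m-1} \alpha_1^{2i} \sigma^2.$$
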
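
The forecasting recursion for the ARMA(1,1) particular case and the error variance of Example 7.3.2 can be sketched in a few lines. This is an illustration under assumptions: the function names `arma11_forecast` and `ar1_forecast_error_variance` and all numerical values are hypothetical, and the last noise term $\epsilon_n$ is treated as known, as in the text.

```python
def arma11_forecast(alpha1, theta1, x_n, eps_n, m):
    """m-step forecast E[X_{n+m} | H_n] of an ARMA(1,1) process by recursion."""
    forecast = alpha1 * x_n + theta1 * eps_n     # case m = 1
    for _ in range(m - 1):
        forecast = alpha1 * forecast             # cases m = 2, 3, ...
    return forecast


def ar1_forecast_error_variance(alpha1, sigma2, m):
    """sigma^2 * sum_{i=0}^{m-1} alpha1^(2i) for an AR(1) process."""
    return sigma2 * sum(alpha1 ** (2 * i) for i in range(m))


# Illustrative values (assumptions, not taken from the text).
alpha1, theta1, x_n, eps_n = 0.5, 0.25, 2.0, 0.4

for m in (1, 2, 3):
    closed_form = alpha1 ** m * x_n + alpha1 ** (m - 1) * theta1 * eps_n
    print(m, arma11_forecast(alpha1, theta1, x_n, eps_n, m), closed_form)

# With theta1 = 0 the process is an AR(1) and the forecast is alpha1^m * x_n.
print(arma11_forecast(alpha1, 0.0, x_n, eps_n, 3))
print([ar1_forecast_error_variance(alpha1, 1.0, m) for m in (1, 2, 3)])
```

The error variance grows with $m$ and approaches $\sigma^2/(1-\alpha_1^2)$, the variance of the process itself, so long-range forecasts carry little information beyond the mean.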

Example 7.3.3. In the case of an MA($q$) process, we can write that

$$X_{n+m} = \epsilon_{n+m} + \sum_{i=1}^{q} \theta_i \epsilon_{n+m-i} \quad \text{for all } n \in \mathbb{Z} \text{ and for } m = 1, 2, \ldots.$$

Hence, we deduce that

$$E[X_{n+m} \mid H_n] = \begin{cases} \sum_{i=m}^{q} \theta_i \epsilon_{n+m-i} & \text{if } m = 1, \ldots, q, \\ 0 & \text{if } m = q+1, q+2, \ldots. \end{cases}$$

When $q = 1$, we simply have that

$$E[X_{n+m} \mid H_n] = \begin{cases} \theta_1 \epsilon_n & \text{if } m = 1, \\ 0 & \text{if } m = 2, 3, \ldots, \end{cases}$$

so that the variance of the forecasting error is

$$\mathrm{VAR}\big[X_{n+m} - E[X_{n+m} \mid H_n]\big] = \mathrm{VAR}[\epsilon_{n+m} + \theta_1 \epsilon_{n+m-1} - \theta_1 \epsilon_{n+m-1}] = \mathrm{VAR}[\epsilon_{n+m}] = \sigma^2 \quad \text{if } m = 1$$

and

$$\mathrm{VAR}\big[X_{n+m} - E[X_{n+m} \mid H_n]\big] = \mathrm{VAR}[\epsilon_{n+m} + \theta_1 \epsilon_{n+m-1}] \overset{\text{ind.}}{=} (1 + \theta_1^2)\sigma^2 \quad \text{if } m = 2, 3, \ldots.$$

Observe that the variance of the forecasting error is the same for any $m \in \{2, 3, \ldots\}$,
whereas in the case of an AR(1) process this variance increases with $m$ (see the previous
example).

To conclude, we consider the case when we want to forecast the value of a zero mean
weakly stationary time series at time $n+m$, for $m = 1, 2, \ldots$, based on a linear function
of $X_n, X_{n-1}, \ldots, X_1$. That is, we look for the function [see (7.36)]

$$g(X_{n+m} \mid H_n) = \sum_{i=1}^{n} \xi_i X_{n-i+1}$$

that minimizes the mean-square error

$$\mathrm{MSE}(\xi_1, \ldots, \xi_n) := E\left[\left(X_{n+m} - \sum_{i=1}^{n} \xi_i X_{n-i+1}\right)^2\right].$$

Remark. The fact that $E[X_n] = 0$ implies that $\xi_0$ in (7.36) is equal to 0.

To obtain the values of the parameters $\xi_i$ that minimize the function MSE, we
differentiate this function with respect to $\xi_i$ and we set the derivative equal to 0, for
$i = 1, \ldots, n$. We find that the $\xi_i$s must satisfy the system of linear equations

$$\sum_{i=1}^{n} \xi_i \gamma(i - r) = \gamma(m - 1 + r) \quad \text{for } r = 1, \ldots, n.$$

We can easily obtain an explicit solution when $m = 1$, that is, when the aim is to
forecast the next value of the time series. We then have that

$$\sum_{i=1}^{n} \xi_i \gamma(i - r) = \gamma(r) \quad \text{for } r = 1, \ldots, n,$$

which can be rewritten as follows:

$$(\xi_1, \ldots, \xi_n) \begin{pmatrix}
\gamma(1-1) & \gamma(1-2) & \cdots & \gamma(1-n) \\
\gamma(2-1) & \gamma(2-2) & \cdots & \gamma(2-n) \\
\vdots & \vdots & & \vdots \\
\gamma(n-1) & \gamma(n-2) & \cdots & \gamma(n-n)
\end{pmatrix} = (\gamma(1), \ldots, \gamma(n)),$$

that is,

$$\Xi_n \Gamma_n = G_n,$$

where $\Xi_n := (\xi_1, \ldots, \xi_n)$, $G_n := (\gamma(1), \ldots, \gamma(n))$ and

$$\Gamma_n := \begin{pmatrix}
\gamma(0) & \gamma(1) & \cdots & \gamma(n-1) \\
\gamma(1) & \gamma(0) & \cdots & \gamma(n-2) \\
\vdots & \vdots & & \vdots \\
\gamma(n-1) & \gamma(n-2) & \cdots & \gamma(0)
\end{pmatrix}.$$

Assuming that the matrix $\Gamma_n$ is invertible (or nonsingular), we can write that

$$\Xi_n = G_n \Gamma_n^{-1}$$

and

$$g(X_{n+1} \mid H_n) = \Xi_n (X_n, \ldots, X_1)^T.$$

Remark. It can be shown that the (last) coefficient $\xi_n$ is in fact equal to the partial
autocorrelation $\phi(n)$.

Finally, we find that the mean-square forecasting error is given by

$$E\big[\{X_{n+1} - g(X_{n+1} \mid H_n)\}^2\big] = \gamma(0) - G_n \Gamma_n^{-1} G_n^T.$$
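
The linear system for the one-step best linear predictor can be solved numerically. The sketch below is illustrative only: `solve`, `best_linear_predictor` and `gamma_ar1` are hypothetical names, the solver is a basic Gauss-Jordan elimination, and the autocovariance function is that of an AR(1) process with $\alpha_1 = 0.5$ and $\sigma^2 = 1$, chosen for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def best_linear_predictor(gamma, n):
    """Coefficients xi_1..xi_n and mean-square error for forecasting X_{n+1}."""
    big_gamma = [[gamma(abs(i - j)) for j in range(n)] for i in range(n)]
    g = [gamma(r) for r in range(1, n + 1)]
    xi = solve(big_gamma, g)            # Gamma_n is symmetric
    mse = gamma(0) - sum(c * v for c, v in zip(xi, g))
    return xi, mse


def gamma_ar1(k, alpha1=0.5, sigma2=1.0):
    """Autocovariance of a stationary AR(1) process (assumed parameters)."""
    return sigma2 * alpha1 ** abs(k) / (1 - alpha1 ** 2)


xi, mse = best_linear_predictor(gamma_ar1, 3)
print(xi, mse)
```

For this AR(1) autocovariance only the first coefficient is nonzero, so the best linear predictor uses $X_n$ alone, and the mean-square error reduces to the noise variance.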


Example 7.3.4. To forecast the value of $X_{n+1}$ for a (zero mean) weakly stationary
AR(1) process, based on a single observation, $X_n$, of the process, we calculate

$$g(X_{n+1} \mid X_n) = \xi_1 X_n = \gamma(1) \Gamma_1^{-1} X_n = \frac{\gamma(1)}{\gamma(0)} X_n.$$

Making use of (7.11), we can write that

$$g(X_{n+1} \mid X_n) = \alpha_1 X_n,$$

which is the same predictor as $E[X_{n+1} \mid H_n]$ obtained in Example 7.3.2.

Example 7.3.5. For a weakly stationary AR(2) time series (having zero mean), the
best linear predictor of $X_{n+1}$, based on $X_n$, is also

$$g(X_{n+1} \mid X_n) = \frac{\gamma(1)}{\gamma(0)} X_n = \rho(1) X_n.$$

To forecast the value of $X_{n+1}$, based on the random variables $X_n$ and $X_{n-1}$, we can
show that $g(X_{n+1} \mid X_n, X_{n-1})$ is given by

$$g(X_{n+1} \mid X_n, X_{n-1}) = \alpha_1 X_n + \alpha_2 X_{n-1}.$$

Actually, we find that

$$g(X_{n+1} \mid X_n, X_{n-1}, \ldots, X_{n-l}) = \alpha_1 X_n + \alpha_2 X_{n-1}$$

for any $l \in \{1, \ldots, n-1\}$.

Remark. In general, if $\{X_n, n \in \mathbb{Z}\}$ is a weakly stationary AR($p$) time series with
$E[X_n] = 0$, then

$$g(X_{n+1} \mid X_n, X_{n-1}, \ldots, X_1) = \alpha_1 X_n + \alpha_2 X_{n-1} + \cdots + \alpha_p X_{n-p+1}$$

for any $n \in \{p, p+1, \ldots\}$.
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Solved exercises 


Question no. 1 

Let {X n , n = 0 , 1 ,...} be an IID(0, a 2 ) noise. We define 

■\r X n ~\~ X n —i 

Yn = --- for n = 1 , 2 ,... . 

Is the stochastic process {Y n ,nm 1,2,...} weakly stationary? 

Question no. 2 

Suppose that the random vector $(X_1, X_2)$ has a bivariate normal distribution and
that $X_1 \sim$ N(0,1) and $X_2 \sim$ N(0,1) are independent random variables. Use Proposition
4.3.1 to find the joint probability density function of the transformation

$$Y_1 = a_1 X_1 + b_1,$$
$$Y_2 = a_2 X_2 + b_2,$$

where $a_i \neq 0$ and $b_i \in \mathbb{R}$, for $i = 1, 2$.

Question no. 3 

Calculate the variance of a weakly stationary AR(3) process $\{X_n, n \in \mathbb{Z}\}$ (with
$E[X_n] = 0$).

Question no. 4 

Suppose that $X_0 = 0$ and let

$$X_n = \alpha_1 X_{n-1} + \epsilon_n \quad \text{for } n = 1, 2, \ldots,$$

where $\{\epsilon_n, n = 1, 2, \ldots\}$ is a GWN(0, $\sigma^2$) process. We define

$$Y_n = e^{X_n} \quad \text{for } n = 1, 2, \ldots.$$

Calculate the expected value of $Y_n$.

Indications. (i) It can be shown that if $X$ and $Y$ are independent random variables,
then so are $g(X)$ and $h(Y)$ for any functions $g$ and $h$.

(ii) The moment-generating function of a random variable $X \sim$ N($\mu, \sigma^2$) is given by

$$M_X(t) := E[e^{tX}] = \exp\left\{\mu t + \frac{\sigma^2}{2} t^2\right\}.$$

Question no. 5 

Calculate the autocorrelation function $\rho(k)$, for $k = 1, 2, 3$, of an MA(3) time series.

Question no. 6 

What are the possible values of the function $\rho(2)$ for an MA(2) process if (a) $\theta_1 = 1$?
(b) $\theta_1 \in \mathbb{R}$?

Question no. 7 

Calculate, in terms of $\gamma(1)$, the variance of a (zero mean) weakly stationary
ARMA(1,2) process.


Question no. 8 

Check the formula (7.35) for an invertible MA(1) time series, for k = 1,2,3. 


Question no. 9 

Consider the following data, denoted by $y_1, y_2, \ldots, y_{40}$:

n        1      2      3      4      5      6      7      8      9     10
y_n  -0.41   0.70   0.86   0.21   0.41  -0.30   0.07  -0.10  -0.44  -0.72

n       11     12     13     14     15     16     17     18     19     20
y_n   0.69  -0.22  -0.20   0.50   0.43   0.28   0.37  -0.13  -0.35  -0.11

n       21     22     23     24     25     26     27     28     29     30
y_n  -0.78  -0.46  -1.03  -0.68  -0.61  -0.41   0.70   0.92  -0.56  -1.09

n       31     32     33     34     35     36     37     38     39     40
y_n   0.65   0.11  -0.90   0.35  -0.17   0.52   0.84   0.85  -0.18  -0.98

Define

$$x_n = y_n - \bar{y} \quad \text{for } n = 1, \ldots, 40,$$

where $\bar{y}$ is the arithmetic mean of the data points. Calculate $\hat{\sigma}_X^2$ and the sample partial
autocorrelation $\hat{\phi}(k)$ of the centered data, for $k = 1, 2, 3$. Could an MA(1) time series
with $\theta_1 = 1/2$ serve as a model for $x_1, x_2, \ldots, x_{40}$ if $\epsilon_0 = 0$ and $\epsilon_n \sim$ U($-1,1$), for
$n = 1, 2, \ldots, 40$? Justify.
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Question no. 10 

(a) Use Equation (7.3) to prove Proposition 7.3.2. (b) Calculate the mathematical
expectation $E[X_{n+j}^2 \mid X_n = x_n]$.


Exercises 


Question no. 1 

The discrete-time stochastic process $\{X_n, n = 0, 1, \ldots\}$ is an IID(0, $\sigma^2$) noise. Is the
process $\{Y_n, n = 1, 2, \ldots\}$ defined by

$$Y_n = X_n X_{n-1} \quad \text{for } n = 1, 2, \ldots$$


weakly stationary? Justify. 

Question no. 2 

Suppose that $X_0 \sim$ N(0,1) and that

$$X_n = X_0 + Y_n \quad \text{for } n = 1, 2, \ldots,$$

where $Y_n = \pm 1$ with probability 1/2 and is independent of $X_0$. Is the stochastic process
$\{X_n, n = 0, 1, \ldots\}$ a Gaussian process? Justify.

Question no. 3 

Let $\{N(t), t \ge 0\}$ be a Poisson process with rate $\lambda > 0$ and define

$$X_n = (-1)^{N(n)} \cdot M \quad \text{for } n = 0, 1, \ldots,$$

where $M$ is a random variable that is equal to 1 or $-1$ with probability 1/2. Moreover,
$M$ is independent of $N(n)$. Calculate $E[X_n]$ and $\mathrm{COV}[X_n, X_{n+m}]$, for $n, m \in \{0, 1, \ldots\}$.
Is the stochastic process $\{X_n, n = 0, 1, \ldots\}$ weakly stationary?

Indication. We can show that $E[(-1)^{N(n)}] = e^{-2\lambda n}$, for $n = 0, 1, \ldots$.

Question no. 4 

A standard Brownian motion (see p. 235) is a continuous-time Gaussian stochastic
process $\{W(t), t \ge 0\}$ such that $W(0) = 0$, $E[W(t)] = 0$, and $\mathrm{COV}[W(s), W(t)] =
\min\{s, t\}$ for all $s, t \ge 0$. We consider the stochastic process $\{X(t), t \ge 0\}$ for which

$$W(t) = t^{1/2} X(\ln t) \quad \text{for } t \ge 1.$$

Is $\{X(t), t \ge 0\}$ a Gaussian process? Is it (strictly) stationary? Justify.

Remark. The stochastic process $\{X(t), t \ge 0\}$ is actually a particular Ornstein-Uhlenbeck
process.

Question no. 5 

Let { X n ,n E Z} and |T n ,n E Z} be two (zero mean) weakly stationary AR(1) 
processes defined by 
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X n = ail n -i + e n and Y n = $\Y n -\ + r\ n Vn G Z, 
where {e n ,n G Z} and {r/ n , n G Z} are independent WN(0,cr 2 ) processes. We define 

Z n = X n + W Vn G Z. 

Calculate P[Z n ] and COV[Z n , Z n+m ], for n, m G Z. Is the stochastic process {Z n , n G Z} 
weakly stationary? 

Question no. 6 

Consider the AR(2) process { X n , n = 0,1,...} given by Xo = 0, X\ = 0, and 
X n — X n —i T X n —2 T for ^ — 2,3,... . 

Suppose that P[e n = 1] = P[e n = —1] = 1/2, for all n. Calculate the probability mass 
function of X 4 . 

Question no. 7 

Let {X n ,n = 0,1,...} and {F n ,n = 0,1,...} be independent random walks (see 
p. 239) with po = 1/2, and define 

X n + w 

Z n = --- for n = 0,1,... . 

Calculate CO V[Z n , Z n+m ], for n,m G {0,1,...}. Is the stochastic process {Z n ,n = 
0,1,...} a random walk? Justify. 

Question no. 8 

Suppose that Xo = 0 and 

X n = cx\X n _\ T 6 n for n — 1,2,..., 

where the independent random variables e n are also independent of A n _i,..., Xo and 
are such that P[e n = 1] = P[e n = —1] = 1/2, for n = 1, 2,... . Let Y n = X 2 , for all n. 
Calculate (a) VAR[F 2 ] and (b) E[Y 2 k ], for k G {1, 2,...}. 

Question no. 9 

Let $\{X_n, n \in \mathbb{Z}\}$ be an MA(1) time series with $\theta_1 = 1/2$. Suppose that $P[\epsilon_n = 1] =
P[\epsilon_n = -1] = 1/2$, for all $n \in \mathbb{Z}$. Calculate $E[X_n \mid X_n > 0]$, for $n \in \mathbb{Z}$.

Question no. 10 

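Consider the time series $\{X_n, n \in \mathbb{Z}\}$ and $\{Y_n, n \in \mathbb{Z}\}$ defined by

$$X_n = \epsilon_n + \theta_1 \epsilon_{n-1}$$

and

$$Y_n = \epsilon_n + \theta_1 \epsilon_{n-1} + \theta_2 \epsilon_{n-2}$$

for all $n \in \mathbb{Z}$. What is the correlation coefficient $\rho_{X_n, Y_n}$, for $n \in \mathbb{Z}$?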
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Question no. 11 

Find, in terms of $\gamma(k)$, the variance of a (zero mean) weakly stationary ARMA(2,1)
process.

Question no. 12 

Is the MA(3) time series {X n ,n G Z} defined by 

X n = e n + #ie n _i -f— e n - 2 + -e n -3 for all n G Z 

invertible if (a) #1 = |? (b) 6 \ = |? Justify. 

Question no. 13 

Let 

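$$X_n = \epsilon_n - \epsilon_{n-1} \quad \text{for all } n \in \mathbb{Z},$$

where $\{\epsilon_n, n \in \mathbb{Z}\}$ is a WN(0, $\sigma^2$) process, so that $\{X_n, n \in \mathbb{Z}\}$ is an MA(1) time series
for which $\theta_1 = -1$. Calculate $\mathrm{VAR}[X_n \mid X_n \neq 0]$ if we assume that $P[\epsilon_n = 1] = P[\epsilon_n =
-1] = 1/2$, for all $n \in \mathbb{Z}$.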

Question no. 14 

Let $\{X_n, n \in \mathbb{Z}\}$ be an MA(1) process and define

$$Y_n = X_n^2 \quad \text{for all } n \in \mathbb{Z}.$$

Calculate the mean and the variance of the stochastic process $\{Y_n, n \in \mathbb{Z}\}$ if $\epsilon_n \sim$
N(0, $\sigma^2$), for all $n$.

Indication. The characteristic function of $Z \sim$ N(0,1) is $C_Z(\omega) = e^{-\omega^2/2}$.

Question no. 15 

Calculate $\mathrm{VAR}[X_{n+m} \mid H_n]$ for $m = 1, 2, \ldots$ if $\{X_n, n \in \mathbb{Z}\}$ is an MA(1) process and
$H_n = \{X_n, X_{n-1}, \ldots\}$. Assume that the $\epsilon_n$s are independent random variables.

Question no. 16 

Let $\{X_n, n \in \mathbb{Z}\}$ be an MA(1) time series. (a) What is the best linear predictor of
$X_{n+1}$, based on $X_n$? (b) Is this predictor similar to $E[X_{n+1} \mid H_n]$ (see Example 7.3.3)?

Question no. 17 

Suppose that $\{X_n, n \in \mathbb{Z}\}$ is a weakly stationary ARMA(1,1) process. Calculate its
partial autocorrelation function $\phi(k)$, for $k = 1, 2$, if $\alpha_1 = \theta_1 = 1/2$.


Question no. 18 

Consider an invertible MA(1) time series with 6 1 > 0. (a) Show that its partial au¬ 
tocorrelation function cj)(k) decreases (approximately) geometrically, in absolute value. 
That is, show that 


0(fc + l) 


c for k large enough, 


where c G (0,1) is a constant, (b) Show that 
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| 0 (fc)| < 61 for k = 1 , 2 ,... . 


Hint. See Equation (7.35). 

Question no. 19 

Let {e n ,n E Z} be a GWN(0,1) process and define 

y ^Tl bI” ^ Tl 1 w _ r-n 

X n = --- V n G Z. 

Calculate the mathematical expectation E[X n+ j \ X n = x n \, for j = 1 , 2 ,..., and show 
that it is of the form given in Equations (7.37) and (7.38). 

Question no. 20 

Suppose that we want to forecast the value of $X_{n+1}$, based on a linear function
of $X_n$ and $X_{n-1}$, for a (zero mean) weakly stationary AR(2) time series. What is the
corresponding mean-square forecasting error [in terms of $\gamma(0)$] if $\alpha_1 = 1/2$ and $\alpha_2 = 1/4$?

Hint. See Example 7.3.5 and Equation (7.13).

Multiple choice questions 


Question no. 1 

A Bernoulli process is a discrete-time (and discrete-state) stochastic process $\{X_n, n =
0, 1, \ldots\}$, where the $X_n$s are i.i.d. Bernoulli random variables with parameter $p \in (0,1)$.
Calculate $\mathrm{COV}[X_n, X_{n+m}]$, for $n, m \in \{0, 1, 2, \ldots\}$.

(a) 0 $\forall n, m$ (b) 0 if $m = 0$; $p(1-p)$ if $m \neq 0$

(c) 0 if $m \neq 0$; $p(1-p)$ if $m = 0$ (d) $p(1-p)$ $\forall n, m$

(e) $p^2$ if $m \neq 0$; $p$ if $m = 0$

Question no. 2 

Let $(X_1, X_2, X_3)$ have a trivariate normal distribution with $\mathbf{m} = (0, 1, -1)$ and

$$\mathbf{C} = \begin{pmatrix} 1 & 0 & -2 \\ 0 & 2 & 0 \\ -2 & 0 & 4 \end{pmatrix}.$$

Calculate $E[X_1 + X_2 \mid X_3 = 0]$.

(a) 0 (b) 1/2 (c) 1 (d) 3/2 (e) 2 

Question no. 3 

A weakly stationary AR( 1 ) process {X n , n — 0,1,...} is defined by Xq = 0 and 
X n — —X n —i T e n for n = 1, 2,..., 


where {e n , n = 1 , 2 ,...} is a GWN( 0 , 1 ) process. Calculate E[X 2 \ X 2 > 0 ]. 
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(a)|V5 (b)V5 (c (d)^J (e)/f 

Question no. 4 

Suppose that X G = 0 and define the random walk (see p. 239) 

X n = X n _i + e n for n = 1, 2,..., 

where {e n , n = 1, 2,...} is an i.i.d. noise process such that e n takes on the value 1 or — 1 
with probability 1/2, for n = 1,2,... . Calculate the correlation coefficient of X n and 
AT n+m , for n,m E {1,2,.. 

(a)(d^) 1/2 (b)(^) 1/2 ( C )(^F7. (d) n+m (e)^ 

Question no. 5 

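Suppose that

$$X_n = \frac{1}{2} X_{n-1} + \epsilon_n + \frac{1}{2} \epsilon_{n-1} \quad \forall n \in \mathbb{Z}.$$

That is, $\{X_n, n \in \mathbb{Z}\}$ is an ARMA(1,1) process with $\alpha_1 = \theta_1 = 1/2$. Calculate the
correlation coefficient $\rho_{X_n, X_{n-2}}$.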

(a) 2/7 (b) 5/14 (c) 1/2 (d) 9/14 (e) 5/7 

Question no. 6 

Calculate $E[X_n \mid X_{n-1} = 2]$ for an MA(1) time series with $\theta_1 = 1$ and for which
$P[\epsilon_n = 1] = P[\epsilon_n = -1] = 1/2$, where the $\epsilon_n$s are independent random variables.

(a) 0 (b) 1 (c) 3/2 (d) 2 (e) 3 

Question no. 7 

What are the possible values of the autocorrelation $\rho(1)$ for an ARMA(1,1) time
series with $\alpha_1 = 1/2$?

(a) [-1,1] (b) [-3/4,3/4] (c) [-1/2,1/2] (d) [-1/4,1/4] 

(e) [-1/4,3/4] 

Question no. 8 

Let $\{X_n, n \in \mathbb{Z}\}$ be a stationary Gaussian process with $\mu = 3$ and $\rho(1) = 1/2$.
Calculate the mathematical expectation $E[X_{n+1} X_n \mid X_n = -2]$.

(a) -3 (b) -1 (c) 0 (d) 1 (e) 3 

Question no. 9 

Suppose that the time series $\{X_n, n \in \mathbb{Z}\}$ is an MA(1) process with $\theta_1 = 1$ and that
the independent random variables $\epsilon_n$ are such that $P[\epsilon_n = 1] = P[\epsilon_n = -1] = 1/2$, for
all $n \in \mathbb{Z}$. Calculate $E[X_{n+1} \mid X_n^2 = 4]$.

(a) -1 (b) -1/2 (c) 0 (d) 1/2 (e) 1 

Question no. 10 

We want to calculate the best linear predictor $g(X_{n+1} \mid X_n, X_{n-1})$ for an MA(1)
time series with $\theta_1 = -1/2$. What is the value of the coefficient $\xi_2$?

(a) -2/5 (b) -4/21 (c) 0 (d) 4/21 (e) 2/5

Chapter 1

$\lim_{x \to x_0} f(x)$            limit of the function $f(x)$ as $x$ tends to $x_0$
$\lim_{x \downarrow x_0} f(x)$     right-hand limit of $f(x)$ as $x$ decreases to $x_0$
$\lim_{x \uparrow x_0} f(x)$       left-hand limit of $f(x)$ as $x$ increases to $x_0$
$g \circ f$                        composition of the functions $g$ and $f$
$f'(x_0)$                          derivative of $f(x)$ at $x_0$
$f'(x_0^+)$                        right-hand derivative of $f(x)$ at $x_0$
$f'(x_0^-)$                        left-hand derivative of $f(x)$ at $x_0$
$C_X(\omega)$                      characteristic function of $X$
$F(\omega)$                        Fourier transform
$M_X(t)$                           moment-generating function of $X$
$\{a_n\}_{n=1}^{\infty}$           infinite sequence
$\sum_{n=1}^{\infty} a_n$          infinite series
$\sum_{k=1}^{n} a_k$               $n$th partial sum of the series
$R$                                radius of convergence of the series
$f * g$                            convolution of the functions $f$ and $g$
$G_X(z)$                           generating function of $X$

Chapter 2

$\Omega$                           sample space
$A, B, C$                          events
$A \cap B = \emptyset$             incompatible or mutually exclusive events
$A'$                               complement of event $A$
$P[A]$                             probability of event $A$
$P[A \mid B]$                      conditional probability of event $A$, given that $B$ occurred



270 A List of symbols and abbreviations 


Px 

Fx 

fx 

B (n,p) 

Geo (p) 

NB (r,p) 

Chapter 3 

probability (mass) function of X 
distribution function of X 
(probability) density function of X 
binomial distribution with parameters n and p 
geometric distribution with parameter p 
negative binomial distribution 
with parameters r and p 

Hyp (N, n, d) 

hypergeometric distribution 
with parameters TV, n, and d 

Poi(A) 

N(/U,cr 2 ) 

Poisson distribution with parameter A 
normal or Gaussian distribution 
with parameters fi and a 2 

Z 

<p(z) 

N(0,1) 

*(*) 

Q(z) 

r 

G(a, A) 

W(A, (3) 

Be(a, (3) 

LN( M ,a 2 ) 

E[g{X)} 

Hx or E[X\ 
x m or x 

x p 

Xp 

A 

/4 

Pk 

Pi 

P2 

random variable having a N(0,1) distribution 
density function of the N(0,1) distribution 
standard or unit normal distribution 
distribution function of the N(0,1) distribution 

1 - 0(z) 

gamma function 

gamma distribution with parameters a and A 

Weibull distribution with parameters A and /3 
beta distribution with parameters a and [3 
lognormal distribution with parameters p and a 2 
mathematical expectation of a function g of X 
mean or expected value of X 
median of X 

100(1 — p)th quantile of X 

100(1 — p)th percentile (if lOOp is an integer) 
variance of X 

/cth-order moment about the origin or noncentral moment 
/cth-order moment about the mean or central moment 
skewness (coefficient) 
kurtosis (coefficient) 
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Px.Y 

Fx,y 

fx,Y 

fx{x I Ay) 
E[Y | X = x] 
101 
X®X 
COV[X, Y] 
CORR[X, Y] 
i.i.d. 

CLT 


R(x) 

MTTF 

MTBF 

MTTR 

r(t) 

IFR 

DFR 

FR(t 1 ,t 2 ) 

AFR(tyt 2 ) 

H(x i,...,x n ) 
x := (xi,... ,5 

X > y 


Chapter 4 


joint probability function of (X,Y) 
joint distribution function of (X,Y) 
joint density function of (X,Y) 
conditional density function of X, given Ay 
conditional expectation of Y, given that X = x 
convolution product of X with itself 
convolution sum of X with itself 
covariance of X and Y 
or px,Y correlation coefficient of X and Y 

independent and identically distributed 
central limit theorem 
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reliability or survival function 
Mean Time To Failure 
Mean Time Between Failures 
Mean Time To Repair 
failure or hazard rate function 
Increasing Failure Rate 
Decreasing Failure Rate 

interval failure rate of a system in the interval (£ 1 ,^ 2 ] 
average failure rate of a system over an interval 
structure function of the system 
n ) state vector of the system 
Xi>yi,i = l,...,n, 

for the vectors x = (xi,..., x n ) and y «= (2/1,..., y n ) 
Xi > yi, i = 1,..., n, and Xi > yi for at least one i 
for the vectors x = (xi, _, x n ) and y = (2/1,...., y n ) 


x > y 
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{X(t),teT} 

Pi,j 

stochastic or random process 

probability that the continuous-time Markov chain 
{X(£),£ > 0}, when it leaves state z, goes to state j 

Xi,i — 0. 1,... 
Mil 7 2, . • . 

n j 

M/M/s 

x(t) 

N 

Nq 

N s 

T 

Q 

S 

A a 

A e 

D(t) 

N 

N s 

P 

M/M/l/c 

FIFO 

7Tb 

M/G/s/s 

birth or arrival rates of a birth and death process 

death or departure rates of a birth and death process 

limiting probability that the process will be in state j 

queueing system with s servers 

number of customers in the system at time t 

average number of customers in the system in equilibrium 

average number of customers who are waiting in line 

average number of customers being served 

average time that a customer spends in the system 

average waiting time of an arbitrary customer 

average service time of an arbitrary customer 

average arrival rate 

average entering rate of customers into the system 

number of departures from the queueing system in [0, t\ 

number of customers in the system in equilibrium 

number of customers being served 

traffic intensity or utilization rate 

queueing system with one server and finite capacity c 

First In, First Out 

probability that all servers are busy 

Erlang loss system 
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7(0 
p( 0 
ACF 
ACVF 
IID(0, a 2 ) 
WN(0, a 2 ) 
GWN(0, a 2 ) 

C 

T 

N(m, C) 
erf 

AR($p$) 

MA($q$) 

$L$ 

$\Delta$ 

ARMA(p, q) 
ARIMA(p, d, q) 

H-) 

o 

$H_n$ 

9^-^-n+j | 

MSE 


Chapter 7 


(auto) covariance function of a stationary process 

autocorrelation function of a stationary process 

autocorrelation function 

autocovariance function 

i.i.d. noise with zero mean 

white noise process 

Gaussian white noise 

covariance matrix 

transpose of a vector or a matrix 

multinormal distribution 

error function 

autoregressive process of order p 
moving average process of order q 
lag or backshift operator 
difference operator 

autoregressive moving average process of order (p, q) 

autoregressive integrated moving average process 

partial autocorrelation function 

point estimate of a parameter $\theta$ 

history of the stochastic process up to time $n$ 

predictor of $X_{n+j}$, given $H_n$ 

mean-square error 
B.2 Distribution function of the Poisson distribution 
B.3 Values of the function $\Phi(z)$ 

B.4 Values of the function $Q^{-1}(p)$ for some values of $p$ 
Table B.1. Distribution function of the binomial distribution 

                                p
 n   x    0.05    0.10    0.20    0.25    0.40    0.50

 2   0  0.9025  0.8100  0.6400  0.5625  0.3600  0.2500
     1  0.9975  0.9900  0.9600  0.9375  0.8400  0.7500

 3   0  0.8574  0.7290  0.5120  0.4219  0.2160  0.1250
     1  0.9927  0.9720  0.8960  0.8438  0.6480  0.5000
     2  0.9999  0.9990  0.9920  0.9844  0.9360  0.8750

 4   0  0.8145  0.6561  0.4096  0.3164  0.1296  0.0625
     1  0.9860  0.9477  0.8192  0.7383  0.4752  0.3125
     2  0.9995  0.9963  0.9728  0.9493  0.8208  0.6875
     3  1.0000  0.9999  0.9984  0.9961  0.9744  0.9375

 5   0  0.7738  0.5905  0.3277  0.2373  0.0778  0.0313
     1  0.9774  0.9185  0.7373  0.6328  0.3370  0.1875
     2  0.9988  0.9914  0.9421  0.8965  0.6826  0.5000
     3  1.0000  0.9995  0.9933  0.9844  0.9130  0.8125
     4  1.0000  1.0000  0.9997  0.9990  0.9898  0.9688

10   0  0.5987  0.3487  0.1074  0.0563  0.0060  0.0010
     1  0.9139  0.7361  0.3758  0.2440  0.0464  0.0107
     2  0.9885  0.9298  0.6778  0.5256  0.1673  0.0547
     3  0.9990  0.9872  0.8791  0.7759  0.3823  0.1719
     4  0.9999  0.9984  0.9672  0.9219  0.6331  0.3770
     5  1.0000  0.9999  0.9936  0.9803  0.8338  0.6230
     6          1.0000  0.9991  0.9965  0.9452  0.8281
     7                  0.9999  0.9996  0.9877  0.9453
     8                  1.0000  1.0000  0.9983  0.9893
     9                                  0.9999  0.9990
Table B.1. Continued 

                                p
 n   x    0.05    0.10    0.20    0.25    0.40    0.50

15   0  0.4633  0.2059  0.0352  0.0134  0.0005  0.0000
     1  0.8290  0.5490  0.1671  0.0802  0.0052  0.0005
     2  0.9638  0.8159  0.3980  0.2361  0.0271  0.0037
     3  0.9945  0.9444  0.6482  0.4613  0.0905  0.0176
     4  0.9994  0.9873  0.8358  0.6865  0.2173  0.0592
     5  0.9999  0.9977  0.9389  0.8516  0.4032  0.1509
     6  1.0000  0.9997  0.9819  0.9434  0.6098  0.3036
     7          1.0000  0.9958  0.9827  0.7869  0.5000
     8                  0.9992  0.9958  0.9050  0.6964
     9                  0.9999  0.9992  0.9662  0.8491
    10                  1.0000  0.9999  0.9907  0.9408
    11                          1.0000  0.9981  0.9824
    12                                  0.9997  0.9963
    13                                  1.0000  0.9995
    14                                          1.0000

20   0  0.3585  0.1216  0.0115  0.0032  0.0000
     1  0.7358  0.3917  0.0692  0.0243  0.0005  0.0000
     2  0.9245  0.6769  0.2061  0.0913  0.0036  0.0002
     3  0.9841  0.8670  0.4114  0.2252  0.0160  0.0013
     4  0.9974  0.9568  0.6296  0.4148  0.0510  0.0059
     5  0.9997  0.9887  0.8042  0.6172  0.1256  0.0207
     6  1.0000  0.9976  0.9133  0.7858  0.2500  0.0577
     7          0.9996  0.9679  0.8982  0.4159  0.1316
     8          0.9999  0.9900  0.9591  0.5956  0.2517
     9          1.0000  0.9974  0.9861  0.7553  0.4119
    10                  0.9994  0.9961  0.8725  0.5881
    11                  0.9999  0.9991  0.9435  0.7483
    12                  1.0000  0.9998  0.9790  0.8684
    13                          1.0000  0.9935  0.9423
    14                                  0.9984  0.9793
    15                                  0.9997  0.9941
    16                                  1.0000  0.9987
    17                                          0.9998
    18                                          1.0000
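The entries of Table B.1 can be recomputed directly from the binomial probability mass function. The short Python sketch below is not part of the original text; the function name `binomial_cdf` is an illustrative choice, and the printed values are only spot checks against a few table entries.

```python
from math import comb


def binomial_cdf(x: int, n: int, p: float) -> float:
    """Return P[X <= x] for X ~ B(n, p) by summing the probability mass function."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


# Spot checks against Table B.1 (values rounded to four decimals, as in the table)
print(round(binomial_cdf(2, 10, 0.10), 4))
print(round(binomial_cdf(1, 5, 0.25), 4))
```

Differences of consecutive entries give individual probabilities, for example P[X = 2] = F(2) - F(1).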
Table B.2. Distribution function of the Poisson distribution 

                              $\alpha$
 x     0.5       1     1.5       2       5      10      15      20

 0  0.6065  0.3679  0.2231  0.1353  0.0067  0.0000
 1  0.9098  0.7358  0.5578  0.4060  0.0404  0.0005
 2  0.9856  0.9197  0.8088  0.6767  0.1247  0.0028  0.0000
 3  0.9982  0.9810  0.9344  0.8571  0.2650  0.0103  0.0002
 4  0.9998  0.9963  0.9814  0.9473  0.4405  0.0293  0.0009  0.0000
 5  1.0000  0.9994  0.9955  0.9834  0.6160  0.0671  0.0028  0.0001
 6          0.9999  0.9991  0.9955  0.7622  0.1301  0.0076  0.0003
 7          1.0000  0.9998  0.9989  0.8666  0.2202  0.0180  0.0008
 8                  1.0000  0.9998  0.9319  0.3328  0.0374  0.0021
 9                          1.0000  0.9682  0.4579  0.0699  0.0050
10                                  0.9863  0.5830  0.1185  0.0108
11                                  0.9945  0.6968  0.1848  0.0214
12                                  0.9980  0.7916  0.2676  0.0390
13                                  0.9993  0.8645  0.3632  0.0661
14                                  0.9998  0.9165  0.4657  0.1049
15                                  0.9999  0.9513  0.5681  0.1565
16                                  1.0000  0.9730  0.6641  0.2211
17                                          0.9857  0.7489  0.2970
18                                          0.9928  0.8195  0.3814
19                                          0.9965  0.8752  0.4703
20                                          0.9984  0.9170  0.5591
21                                          0.9993  0.9469  0.6437
22                                          0.9997  0.9673  0.7206
23                                          0.9999  0.9805  0.7875
24                                          1.0000  0.9888  0.8432
25                                                  0.9938  0.8878
26                                                  0.9967  0.9221
27                                                  0.9983  0.9475
28                                                  0.9991  0.9657
29                                                  0.9996  0.9782
30                                                  0.9998  0.9865
31                                                  0.9999  0.9919
32                                                  1.0000  0.9953
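For parameter values that are not tabulated, the Poisson distribution function can be evaluated with a few lines of code. The following Python sketch is an illustration only (the name `poisson_cdf` is an assumption, not something used in the book); it builds each term from the previous one to avoid computing large factorials.

```python
from math import exp


def poisson_cdf(x: int, alpha: float) -> float:
    """Return P[X <= x] for X ~ Poi(alpha)."""
    term = exp(-alpha)  # P[X = 0]
    total = term
    for k in range(1, x + 1):
        term *= alpha / k  # P[X = k] obtained from P[X = k - 1]
        total += term
    return total


# Spot checks against Table B.2
print(round(poisson_cdf(1, 2.0), 4))
print(round(poisson_cdf(3, 0.5), 4))
```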
Table B.3. Values of the function $\Phi(z)$ 

  z   +0.00   +0.01   +0.02   +0.03   +0.04   +0.05   +0.06   +0.07   +0.08   +0.09

0.0  0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1  0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2  0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3  0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4  0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5  0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6  0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7  0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8  0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9  0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0  0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1  0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2  0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3  0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4  0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5  0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6  0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7  0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8  0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9  0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0  0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1  0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2  0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3  0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4  0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5  0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6  0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7  0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8  0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9  0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
Table B.3. Continued 

  z   +0.00   +0.01   +0.02   +0.03   +0.04   +0.05   +0.06   +0.07   +0.08   +0.09

3.0  0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
3.1  0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
3.2  0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
3.3  0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
3.4  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
3.5  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998
3.6  0.9998  0.9998  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.7  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.8  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.9  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000


Table B.4. Values of the function $Q^{-1}(p)$ for some values of $p$ 

$p$            0.10    0.05    0.01   0.005   0.001  0.0001 0.00001

$Q^{-1}(p)$   1.282   1.645   2.326   2.576   3.090   3.719   4.265
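Tables B.3 and B.4 can also be reproduced numerically. The sketch below is not from the book; it assumes, as the tabulated values suggest, that $Q(z) = 1 - \Phi(z)$, so that $Q^{-1}(p)$ is the point exceeded with probability $p$ by a standard normal variable. It uses `NormalDist` from Python's standard `statistics` module; the helper names `phi` and `q_inverse` are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def phi(z: float) -> float:
    """Distribution function of the N(0, 1) distribution (Table B.3)."""
    return STANDARD_NORMAL.cdf(z)


def q_inverse(p: float) -> float:
    """Value z such that P[N(0, 1) > z] = p (Table B.4), assuming Q = 1 - Phi."""
    return STANDARD_NORMAL.inv_cdf(1.0 - p)


print(round(phi(1.5), 4))
print(round(q_inverse(0.05), 3))
```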
Solutions to “Solved exercises” 


Chapter 1 


Question no. 1 

When $x$ decreases to 0, $1/x$ increases to $\infty$. The function $\sin x$ does not converge 
as $x$ tends to $\infty$. However, because $-1 \le \sin x \le 1$, for any real $x$, we may conclude 
that $\lim_{x \downarrow 0} x \sin(1/x) = 0$. This result can be proved from the definition of the limit of 
a function. We have: 

$$0 < |x| < \epsilon \implies |x \sin(1/x)| \le |x| < \epsilon.$$

Hence, we can take $\delta = \epsilon$ in Definition 1.1.1 and we can actually write that 

$$\lim_{x \to 0} x \sin(1/x) = 0.$$

Note that the function $f(x) := x \sin(1/x)$ is not defined at $x = 0$. 

Question no. 2 

The functions $f_1(x) := \sin x$ and $f_2(x) := x$ are continuous, for any real $x$. Moreover, 
it can be shown that if $f_1(x)$ and $f_2(x)$ are continuous, then $g(x) := f_1(x)/f_2(x)$ is also 
a continuous function, for any $x$ such that $f_2(x) \neq 0$. Therefore, we can assert that $f(x)$ 
is continuous at any $x \neq 0$. 

Next, using the series expansion of the function $\sin x$: 

$$\sin x = x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \frac{1}{7!}x^7 + \cdots,$$

we may write that 

$$\frac{\sin x}{x} = 1 - \frac{1}{3!}x^2 + \frac{1}{5!}x^4 - \frac{1}{7!}x^6 + \cdots.$$

Hence, we deduce that 

$$\lim_{x \to 0} \frac{\sin x}{x} = 1.$$

Remark. To obtain the previous result, we can also use l'Hospital's rule: 

$$\lim_{x \to 0} \frac{\sin x}{x} = \lim_{x \to 0} \frac{\cos x}{1} = 1.$$

Because, by definition, 

$$f(0) = 1 = \lim_{x \to 0} \frac{\sin x}{x},$$

we conclude that the function $f(x)$ is continuous at any $x \in \mathbb{R}$. 

Question no. 3 

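Let $g(x) = \sqrt{3x + 1}$ and $h(x) = (2x^2 + 1)^2$. We have, using the chain rule: 

$$g'(x) = \frac{1}{2\sqrt{3x + 1}}(3) \quad \text{and} \quad h'(x) = 2(2x^2 + 1)(4x).$$

Thus, 

$$f'(x) = g'(x)h(x) + g(x)h'(x) = \frac{3(2x^2 + 1)^2}{2\sqrt{3x + 1}} + \sqrt{3x + 1}\,(8x)(2x^2 + 1)$$

$$\iff f'(x) = (2x^2 + 1)\left\{\frac{3(2x^2 + 1)}{2\sqrt{3x + 1}} + 8x\sqrt{3x + 1}\right\}.$$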


Question no. 4 

We have: 

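$$\lim_{x \downarrow 0} x \ln x = \lim_{x \downarrow 0} x \cdot \lim_{x \downarrow 0} \ln x = 0 \times (-\infty),$$

which is indeterminate. Writing that 

$$x \ln x = \frac{\ln x}{1/x},$$

we obtain that 

$$\lim_{x \downarrow 0} x \ln x = \lim_{x \downarrow 0} \frac{\ln x}{1/x} = \frac{-\infty}{\infty}.$$

We can then use l'Hospital's rule: 

$$\lim_{x \downarrow 0} \frac{\ln x}{1/x} = \lim_{x \downarrow 0} \frac{1/x}{-1/x^2} = \lim_{x \downarrow 0} (-x) = 0.$$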
Question no. 5 

In as much as the derivative of $\ln x$ is $1/x$, we can write that 

$$\int_1^e \frac{\ln x}{x}\,dx = \frac{(\ln x)^2}{2}\bigg|_1^e = \frac{(\ln e)^2 - (\ln 1)^2}{2} = \frac{1}{2}.$$

This result can be checked by using the integration by substitution method: setting 
$y = \ln x \iff x = e^y$, we deduce that 

$$\int \frac{\ln x}{x}\,dx = \int \frac{y}{e^y}\,e^y\,dy = \int y\,dy = \frac{1}{2}y^2.$$

It follows that 

$$\int_1^e \frac{\ln x}{x}\,dx = \frac{1}{2}y^2\bigg|_0^1 = \frac{1}{2}.$$

Question no. 6 

We use the integration by parts technique. We set 

$$u = x^2 \quad \text{and} \quad dv = x e^{-x^2/2}\,dx.$$

Because $v = -e^{-x^2/2}$, it follows that 

$$I_6 = -x^2 e^{-x^2/2}\bigg|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} 2x e^{-x^2/2}\,dx.$$

By l'Hospital's rule, the above constant term is equal to 0, and the integral is given by 

$$\int_{-\infty}^{\infty} 2x e^{-x^2/2}\,dx = -2e^{-x^2/2}\bigg|_{-\infty}^{\infty} = 0.$$

Hence, we have that $I_6 = 0$. 

Remark. In probability, we deduce from this result that the mathematical expectation 
or expected value of the cube of a standard normal random variable is equal to zero. 

Question no. 7 

We have: 


$$F(\omega) = \int_0^\infty e^{j\omega x} c e^{-cx}\,dx = \int_0^\infty c e^{(j\omega - c)x}\,dx = c\,\frac{e^{(j\omega - c)x}}{j\omega - c}\bigg|_0^\infty = -\frac{c}{j\omega - c} = \frac{c}{c - j\omega}.$$

Remarks. (i) The fact that $j$ is a (pure) imaginary number does not cause any problem, 
because it is a constant. 

(ii) In probability, the function $F(\omega)$ obtained above is the characteristic function of a 
random variable having an exponential distribution with parameter $c$. 
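Question no. 8 

We can write that 

$$\int_0^1 \int_0^{\sqrt{y}} (x + y)\,dx\,dy = \int_0^1 \left(\frac{x^2}{2} + xy\right)\bigg|_0^{\sqrt{y}} dy = \int_0^1 \left(\frac{y}{2} + y^{3/2}\right) dy$$

$$= \left(\frac{y^2}{4} + \frac{y^{5/2}}{5/2}\right)\bigg|_0^1 = \frac{1}{4} + \frac{2}{5} = \frac{13}{20}.$$

Or: 

$$\int_0^1 \int_{x^2}^1 (x + y)\,dy\,dx = \int_0^1 \left[x(1 - x^2) + \frac{y^2}{2}\bigg|_{x^2}^1\right] dx = \int_0^1 \left\{x(1 - x^2) + \frac{1}{2} - \frac{x^4}{2}\right\} dx$$

$$= \left(\frac{x^2}{2} - \frac{x^4}{4} + \frac{x}{2} - \frac{x^5}{10}\right)\bigg|_0^1 = \frac{1}{2} - \frac{1}{4} + \frac{1}{2} - \frac{1}{10} = \frac{13}{20}.$$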


Remark. We easily find that 

$$\int_0^1 \int_0^1 (x + y)\,dx\,dy = 1.$$

Hence, if $B := \{(x, y) \in \mathbb{R}^2 : 0 < x < 1, 0 < y < 1, x < y\}$, then we can write (by 
symmetry) that 

$$\iint_B (x + y)\,dx\,dy = \frac{1}{2}.$$
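The value 13/20 of the double integral in Question no. 8 can be checked numerically. The following Python sketch is an illustration added here, not the book's method: it applies a plain midpoint rule over the region $0 < y < 1$, $0 < x < \sqrt{y}$, and the function name and grid size are arbitrary choices.

```python
def integrate_region(n: int = 400) -> float:
    """Midpoint-rule approximation of the integral of (x + y)
    over the region {0 < y < 1, 0 < x < sqrt(y)}."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        width = y**0.5 / n  # step in x on the interval (0, sqrt(y))
        for j in range(n):
            x = (j + 0.5) * width
            total += (x + y) * width * h
    return total


approximation = integrate_region()
print(approximation, 13 / 20)
```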


Question no. 9 

Consider the geometric series 

$$S(1/2, 1/2) := \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = \sum_{n=1}^{\infty} (1/2)^n.$$

This series converges to 1. Hence, we can write that 

$$S_9 = 1 - \frac{1}{2} - \frac{1}{4} = \frac{1}{4}.$$

Or: 

$$S_9 = \frac{1}{4}\,S(1/2, 1/2) = \frac{1}{4}.$$

Remark. The sum $S_9$ represents the probability that the number of tosses needed to 
obtain “heads” with a fair coin will be greater than two. 






Question no. 10 

We have: 
We can differentiate this series term by term (twice). We obtain that 



Because 

$$\sum_{k=1}^{\infty} (k-1)(k-2)(1-p)^{k-3} = \sum_{k=3}^{\infty} (k-1)(k-2)(1-p)^{k-3} = \sum_{n=1}^{\infty} (n+1)(n)(1-p)^{n-1}$$

$$= \sum_{n=1}^{\infty} n^2 (1-p)^{n-1} + \sum_{n=1}^{\infty} n (1-p)^{n-1},$$


we deduce (see Example 1.4.1) that 



2 -p 


Remark. The sum calculated above is the average value of the square of a geometric 
random variable. 


Chapter 2 


Question no. 1 

We have that $\Omega = \{1, 2, 3, 4, 5, (6,1), \ldots, (6,6)\}$. Thus, there are $5 + 6 = 11$ elementary 
outcomes. 

Question no. 2 

Four different partitions of $\Omega$ can be formed: $\{e_1\}, \{e_2, e_3\}$, or $\{e_2\}, \{e_1, e_3\}$, or 
$\{e_3\}, \{e_1, e_2\}$, or $\{e_1\}, \{e_2\}, \{e_3\}$. 

Question no. 3 

Let $S_i$ be the event “the sum of the two numbers obtained is equal to $i$” and let $D_{j,k}$ 
be “the number obtained on the $j$th roll is $k$.” We seek 

$$P[S_4 \mid D_{1,2} \cup D_{1,4} \cup D_{1,6}] = \frac{P[D_{1,2} \cap D_{2,2}]}{P[D_{1,2} \cup D_{1,4} \cup D_{1,6}]} \overset{\text{ind.}}{=} \frac{(1/6)^2}{1/2} = \frac{1}{18}.$$
Question no. 4 

We have that $P[A \mid B] = P[B] = P[A] = 1/4 \implies A$ and $B$ are independent events. 
It follows that 

$$P[A \cap B'] = P[A]P[B'] = (1/4)(3/4) = 3/16.$$

Question no. 5 

Let $F_i$ be the event “component $i$ operates” and let $F_S$ be “the system operates.” 
By symmetry and incompatibility, we can write that 

$$P[F_S] = 3 \times P[F_1 \cap F_2 \cap F_3'] + P[F_1 \cap F_2 \cap F_3]$$

$$\overset{\text{ind.}}{=} 3 \times (0.95)(0.95)(0.05) + (0.95)^3 = 0.99275.$$
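The decomposition above can be verified by brute-force enumeration of the component states. The Python sketch below is illustrative only; it assumes, as the formula for $P[F_S]$ suggests, that the system operates when at least two of its three independent components operate, and the function name `system_reliability` is hypothetical.

```python
from itertools import product


def system_reliability(p: float, n: int = 3, k: int = 2) -> float:
    """Probability that at least k of n independent components operate,
    each component operating with probability p (assumed k-out-of-n structure)."""
    total = 0.0
    for state in product((0, 1), repeat=n):  # 1 = component operates
        working = sum(state)
        if working >= k:
            total += p**working * (1 - p) ** (n - working)
    return total


print(system_reliability(0.95))
```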


Question no. 6 

We have: 

P[A] = P[A flB] + P[A n B'] m 1 + P[A \ B']P[B'} 



Question no. 7 

Let $A_i$ be the event “$i$ ‘heads’ were obtained.” We seek 

$$P[A_3 \mid A_1 \cup A_2 \cup A_3] = \frac{P[A_3]}{P[A_1 \cup A_2 \cup A_3]} \overset{\text{ind.}}{=} \frac{(1/2)^3}{1 - (1/2)^3} = \frac{1}{7}.$$


Question no. 8 

We have: 

$$P[A_1 \mid B] = \frac{P[B \mid A_1]P[A_1]}{P[B \mid A_1]P[A_1] + P[B \mid A_2]P[A_2]} \overset{P[A_1] = P[A_2]}{=} \frac{1/2}{1/2 + 1/4} = \frac{2}{3}.$$


Question no. 9 

We have: 

$$A = \{abc, acb, cab\} \quad \text{and} \quad B = \{abc, acb, bac\}.$$

(a) Because $A \cap B = \{abc, acb\} \neq \emptyset$, $A$ and $B$ do not form a partition of $\Omega$. (In addition, 
$A \cup B \neq \Omega$.) 

(b) P[A] = jg + 4 - | = P[B\ and P[A D B] = P [{a 6 c, acb}] = |. Because 

P[A]P[B]= 1 -- 1 - = ^ = P[AnB], 
we can assert that A and B are independent events. 
Question no. 10 

Let $A_1$ be the event “$A$ occurs on the first repetition.” Likewise for $B_1$ and $C_1$. Then, 
we may write that 

$$P[D] = P[D \mid A_1]P[A_1] + P[D \mid B_1]P[B_1] + P[D \mid C_1]P[C_1] = 1 \cdot P[A] + 0 + P[D]P[C]$$

$$\implies P[D] = \frac{P[A]}{1 - P[C]} = \frac{P[A]}{P[A] + P[B]}.$$


Question no. 11 

Let $A$ be the event “the transistor tested is defectless” and let $D$ be “the transistor 
tested is defective.” Then, we have that $\Omega = \{D, AD, AAD, AAA\}$. 

Remark. We could also write that $\Omega = \{A', AA', AAA', AAA\}$. 


Question no. 12 

We have that $P[B \mid A] = 1 - P[B' \mid A] = 2/7$ and 

$$P[B \mid A] = \frac{P[B \cap A]}{P[A]} = \frac{P[B]}{P[A]} \quad \text{because } B \subset A.$$

It follows that 

$$P[B] = P[B \mid A]P[A] \simeq 0.0952.$$


Question no. 13 

Let D be the event “the teacher holds a PhD,” let F (resp., A, B) be “the teacher 
is a full (resp., associate, assistant) professor,” and let L be “the teacher is a lecturer.” 
We can write that 

$$P[D] = P[D \mid F]P[F] + P[D \mid A]P[A] + P[D \mid B]P[B] + P[D \mid L]P[L]$$

$$= (0.6)(0.3) + (0.7)(0.4) + (0.9)(0.2) + (0.4)(0.1) = 0.68.$$


Question no. 14 

The number of different codes is given by 

$$26 \times 25 \times 24 \times 23 \times 22 = 7{,}893{,}600 \ (= P_5^{26}).$$


Question no. 15 

We have that $\Omega = \{(1,1), (1,2), \ldots, (6,6)\}$. There are 36 equiprobable elementary 
outcomes. 

(a) $B = \{(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)\}$ and $C = B \cup \{(5,6), (6,5)\}$. Therefore, 

$$P[B \mid C] = \frac{P[B \cap C]}{P[C]} = \frac{P[B]}{P[C]} = \frac{6/36}{8/36} = 3/4.$$

(b) We have: 

$$P[A \mid B] = \frac{P[A \cap B]}{P[B]} \overset{\text{(a)}}{=} \frac{P[\{(6,1)\}]}{6/36} = \frac{1/36}{6/36} = \frac{1}{6}.$$

(c) Because the die is fair, we may write that $P[A] = 1/6 = P[A \mid B]$. Hence, $A$ and $B$ 
are independent events. 


Question no. 16 

(a) Let the events be 

A = “the commuter gets home before 5:30 p.m.;” 
B = “the commuter uses the compact car.” 

We seek 


$$P[A] = P[A \mid B]P[B] + P[A \mid B']P[B'] = (0.75)(0.75) + (0.60)(0.25) = 0.7125.$$

(b) We calculate 

$$P[B \mid A'] = \frac{P[A' \mid B]P[B]}{P[A']} \overset{\text{(a)}}{=} \frac{(1 - 0.75)(0.75)}{1 - 0.7125} \simeq 0.6522.$$

(c) We have that $P[A' \cap B'] = P[A' \mid B']P[B'] = (1 - 0.60)(0.25) = 0.1$. 

(d) By independence, we seek 

$$P[A \cap B] \cdot P[A \cap B'] + P[A \cap B'] \cdot P[A \cap B]$$

$$= 2\,P[A \mid B]P[B] \cdot P[A \mid B']P[B'] = 2(0.75)^2 \cdot (0.60)(0.25) = 0.16875.$$
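Parts (a) and (b) are direct applications of the total probability rule and Bayes' rule, and they are easy to recompute. The Python sketch below is only an illustration; the variable names are arbitrary, and the numerical inputs are the ones appearing in the solution above.

```python
# Inputs taken from the solution: B = "the commuter uses the compact car",
# A = "the commuter gets home before 5:30 p.m."
p_b = 0.75
p_a_given_b = 0.75
p_a_given_not_b = 0.60

# (a) Total probability rule
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

# (b) Bayes' rule applied to the complement A'
p_b_given_not_a = (1 - p_a_given_b) * p_b / (1 - p_a)

print(p_a, round(p_b_given_not_a, 4))
```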


Question no. 17 

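We define the events $A$ = “it is raining,” $B$ = “rain was forecast,” and $C$ = “Mr. X 
has his umbrella.” We seek 

$$P[A \cap C'] = \underbrace{P[A \cap B \cap C']}_{0} + P[A \cap B' \cap C'] = P[A \cap B' \cap C']$$

$$= P[A \mid B' \cap C']\,P[B' \cap C'] = \frac{1}{3}\,P[C' \mid B']\,P[B']$$

$$= \frac{1}{3} \times \frac{2}{3} \times \frac{1}{2} = \frac{1}{9}.$$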
Question no. 18 

We have that $\Omega = \{(1,1,1), \ldots, (1,1,6), \ldots, (6,6,6)\}$. There are $6^3 = 216$ elementary 
outcomes, which are all equiprobable. By enumeration, we find that there are 20 
triples for which event $F$ occurs: 

$$(1,2,3), (1,2,4), (1,2,5), (1,2,6), (1,3,4), \ldots.$$

Consequently, the probability asked for is $\frac{20}{216} \simeq 0.0926$. 
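The enumeration can be carried out mechanically. The statement of the exercise is not reproduced here, so the sketch below assumes, from the triples listed above, that $F$ is the event that the three numbers obtained are in strictly increasing order; under that assumption the count is indeed 20.

```python
from itertools import product

# All 6^3 equiprobable outcomes of three rolls of a die
outcomes = list(product(range(1, 7), repeat=3))

# Assumed definition of F: strictly increasing triples
favourable = [t for t in outcomes if t[0] < t[1] < t[2]]

probability = len(favourable) / len(outcomes)
print(len(favourable), len(outcomes), round(probability, 4))
```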

Question no. 19 

We have that $\Omega = \{(G,G), (G,B), (B,G), (B,B)\}$. Moreover, 

$$A_1 = \{(G,B), (B,G)\} \quad \text{and} \quad A_2 = \{(G,B), (B,G), (B,B)\}.$$

(a) Because $A_2' = \{(G,G)\}$, we have that $A_1 \cap A_2' = \emptyset$. Therefore, $A_1$ and $A_2'$ are 
incompatible. 

(b) We may write that $P[A_1] = 2/4$ and $P[A_2] = 3/4$. Given that 

$$P[A_1 \cap A_2] = P[A_1] = 2/4 \neq P[A_1]P[A_2],$$

$A_1$ and $A_2$ are not independent. 

(c) Let $B_i$ (resp., $G_i$) be the event “the $i$th child is a boy (resp., a girl).” We first 
calculate 

$$P[B_3] = P[B_3 \mid B_1 \cap B_2]P[B_1 \cap B_2] + P[B_3 \mid G_1 \cap G_2]P[G_1 \cap G_2]$$
$$+ P[B_3 \mid (B_1 \cap G_2) \cup (G_1 \cap B_2)]\,P[(B_1 \cap G_2) \cup (G_1 \cap B_2)].$$

Hence, we may write that 

$$P[B_1 \cap B_2 \mid B_3] = \frac{P[B_3 \mid B_1 \cap B_2]P[B_1 \cap B_2]}{P[B_3]} = \frac{\frac{11}{20} \cdot \frac{1}{4}}{\frac{39}{80}} \simeq 0.2821.$$


Chapter 3 


Question no. 1 

We must have that 1 = | + | + | = |. Therefore, a must be equal to 2. 
Question no. 2 

We calculate 

$$F_X(0) = \int_{-1}^{0} \frac{3}{4}(1 - x^2)\,dx = \frac{3}{4}\left(x - \frac{x^3}{3}\right)\bigg|_{-1}^{0} = \frac{1}{2}.$$


Remark. This result could have been deduced from the symmetry of the function fx(x) 
about x = 0. 

Question no. 3 

First, we calculate $E[X] = \frac{1}{3}(1 + 2 + 3) = 2$. Next, we have: 

$$E[X^2] = \frac{1}{3}(1^2 + 2^2 + 3^2) = \frac{14}{3} \implies VAR[X] = \frac{14}{3} - 2^2 = \frac{2}{3}.$$

Thus, we may write that $STD[X] = \sqrt{2/3}$. 

Question no. 4 

We have: 

$$E[X^{1/2}] = \int_0^1 x^{1/2} \cdot 2x\,dx = 2\,\frac{x^{5/2}}{5/2}\bigg|_0^1 = \frac{4}{5}.$$


Question no. 5 

We can write that $x_{0.75}^2/4 = 0.25$ and $x_{0.75} > 0$. It follows that $x_{0.75} = 1$. 

Question no. 6 

We have: 

$$f_Y(y) = f_X(y - 1)\left|\frac{d}{dy}(y - 1)\right| = \frac{1}{2} \quad \text{if } 1 < y < 3.$$


Question no. 7 

By definition, $X \sim \text{Hyp}(N = 15, n = 2, d = 5)$. 


Question no. 8 

According to Table B.1, page 276, $x = 1$ is the most probable value, with 
$p_X(1) \simeq 0.6328 - 0.2373 = 0.3955$. 


Question no. 9 

We seek $P[B(10, 0.1) = 2] \overset{\text{Tab. B.1}}{\simeq} 0.9298 - 0.7361 = 0.1937$. 

Question no. 10 

We have: 


$$P[X \ge 1 \mid X \le 1] = \frac{P[X = 1]}{P[X \le 1]} = \frac{e^{-5} \cdot 5}{e^{-5} + e^{-5} \cdot 5} = \frac{5}{6}.$$
Question no. 11 

We want $P[\text{Poi}(2 \cdot 2) = 1] = e^{-4} \cdot 4 \simeq 0.0733$. 

Question no. 12 

We can write that 


$$P[\text{Hyp}(N = 250, n = 5, d = 50) = 0] \simeq P[B(n = 5, p = 50/250) = 0] = \left(\frac{4}{5}\right)^5 \simeq 0.3277.$$


Question no. 13 

We have: 

$$P[B(50, 0.01) \ge 4] \simeq P[\text{Poi}(1/2) \ge 4] = 1 - P[\text{Poi}(1/2) \le 3] \overset{\text{Tab. B.2}}{\simeq} 1 - 0.9982 = 0.0018.$$
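To see how close the Poisson approximation is in this case, the exact binomial tail can be computed alongside it. The Python sketch below is an added illustration under the parameters used above ($n = 50$, $p = 0.01$, so $np = 1/2$); the helper names are hypothetical.

```python
from math import comb, exp, factorial


def binomial_tail(n: int, p: float, k: int) -> float:
    """P[X >= k] for X ~ B(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


def poisson_tail(alpha: float, k: int) -> float:
    """P[X >= k] for X ~ Poi(alpha)."""
    return 1.0 - sum(exp(-alpha) * alpha**i / factorial(i) for i in range(k))


exact = binomial_tail(50, 0.01, 4)
approx = poisson_tail(50 * 0.01, 4)
print(exact, approx)
```

Both probabilities are small, so the absolute error of the approximation is small even though the relative error is not negligible.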


Question no. 14 

We seek 

$$P[\text{Geo}(p = 1/2) = 5] = \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right) = \frac{1}{32} = 0.03125.$$

Question no. 15 

By the memoryless property of the exponential distribution, we may write that 
$P[X > 20 \mid X > 10] = P[X > 10] = P[\text{Exp}(\lambda = 1/10) > 10] = e^{-1} \simeq 0.3679$. 

Question no. 16 

We have: 

$$P[X > 1] = 2P[X > 2] \iff e^{-\lambda} = 2e^{-2\lambda} \iff e^{\lambda} = 2 \iff \lambda = \ln 2.$$

Question no. 17 

We can write that 


$$P[G(\alpha = 2, \lambda = 1) < 4] = P[\text{Poi}(1 \cdot 4) \ge 2].$$

So, we can use a Poisson distribution with parameter 4. 

Question no. 18 

We can write that $X \sim G(\alpha = 10, \lambda = 2)$. 
Question no. 19 

We have: 

$$P[|N(0,1)| < 1/2] \overset{\text{sym.}}{=} 2\Phi(1/2) - 1 \overset{\text{Tab. B.3}}{\simeq} 2(0.6915) - 1 = 0.3830.$$

Question no. 20 

We have: 

$$x_{0.90} = \mu_X + z_{0.90} \cdot \sigma_X = 1 + (-1.282)\sqrt{2} \simeq -0.813.$$

Question no. 21 

(a) Let $X$ be the number of down components. Then, we have that $X \sim B(n = 5, p = 
0.05)$. We seek $P[X \le 1] \overset{\text{Tab. B.1}}{\simeq} 0.977$. 

(b) Let $Y$ be the number of devices needed to obtain a first device that does not operate. 

We have that $Y \sim \text{Geo}(p = P[X > 1] \simeq 1 - 0.977)$. Therefore, $E[Y] = 1/p \simeq 43.5$ 
devices. 

Question no. 22 

(a) Let $N(t)$ be the total number of buses passing in a time period of $t$ hours. Then, 
$N(t) \sim \text{Poi}(4t)$. We seek 

$$P[N(1/2) \le 1] = P[\text{Poi}(2) \le 1] = e^{-2} + 2e^{-2} \simeq 0.4060.$$

(b) Let $T$ be the waiting time between the first and the third bus. Then, $T \sim G(\alpha = 
2, \lambda = 4)$, so that $VAR[T] = \alpha/\lambda^2 = 1/8$ (hour)$^2$. 

(c) Let $W$ be the total waiting time (in minutes). Then, $W \sim \text{Exp}(1/15)$. We want 

$$P[W > 20 \mid W > 5] = P[W > 15] = e^{-15/15} \simeq 0.3679.$$

Question no. 23 

(a) We calculate 

$$P[N(\mu, (0.1\mu)^2) > 1.15\mu] = 1 - \Phi\left(\frac{1.15\mu - \mu}{0.1\mu}\right) = 1 - \Phi(1.5) \overset{\text{Tab. B.3}}{\simeq} 1 - 0.9332 = 0.0668.$$

(b) We seek $x_{0.10} = \mu + z_{0.10} \cdot \sigma \overset{\text{Tab. B.4}}{\simeq} 4 + (-1.282)(0.1)(4) \simeq 3.49$. 

Question no. 24 

Let $X$ be the number of correct answers that the student gets. Then, $X$ follows a 
$B(n = 10, p = 0.5)$ distribution. We want 
$$P[X > 5] = 1 - P[X \le 5] \overset{\text{Tab. B.1}}{\simeq} 1 - 0.6230 = 0.3770.$$


Question no. 25 

We have: 


$$P[B(100, 0.1) = 15] \simeq P[\text{Poi}(10) = 15] = e^{-10}\,\frac{10^{15}}{15!} \simeq 0.0347.$$


Question no. 26 

We can carry out the required integral to obtain the desired probability. However, 
we notice that the function $f_X(x)$ is symmetrical about the origin. That is, $f_X(-x) = 
f_X(x)$. Then, given that $X$ is a continuous random variable that is defined in a bounded 
interval, we may write that $P[X < 0] = 1/2$. 

Question no. 27 

Let $X$ be the IQ of the pupils. Then, $X \sim N(100, (15)^2)$. We look for 

$$P[\{X < 91\} \cup \{X > 130\}] = 1 - P\left[\frac{91 - 100}{15} \le N(0,1) \le \frac{130 - 100}{15}\right]$$

$$= 1 - [\Phi(2) - \Phi(-0.6)] \overset{\text{Tab. B.3}}{\simeq} 1 - [0.9772 - (1 - 0.7257)] = 0.2971.$$
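The same probability can be obtained without standardizing by hand. The following Python sketch is an added illustration that uses `NormalDist` from the standard library with the parameters of the solution; because it does not round the values of $\Phi$ to four decimals, the result may differ from the tabulated answer in the last digit.

```python
from statistics import NormalDist

# IQ modelled as N(100, 15^2), as in the solution above
iq = NormalDist(mu=100, sigma=15)

# P[{X < 91} U {X > 130}]: the two events are disjoint, so the probabilities add
p_outside = iq.cdf(91) + (1 - iq.cdf(130))
print(round(p_outside, 4))
```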


Question no. 28 

(a) Let $X$ be the number of defective transistors. Then, $X$ has a $B(n = 60, p = 0.05)$ 
distribution. We seek 

$$P[X \le 1] = (0.95)^{60} + \binom{60}{1}(0.05)(0.95)^{59} = (0.95)^{59}(0.95 + 3) \simeq 0.1916.$$

(b) We can write that 

$$P[X \le 1] \simeq P[\text{Poi}(60 \times 0.05 = 3) \le 1] = e^{-3}(1 + 3) \simeq 0.1991.$$

(c) Let $W$ be the number of transistors that we must take to get 59 nondefective ones. 
Then, $W \sim NB(r = 59, p = 0.95)$. We want 

$$P[W = 60] = \binom{59}{58}(0.95)^{59}(0.05) = 2.95\,(0.95)^{59} \simeq 0.1431.$$
Question no. 29 

By the Bienaymé-Chebyshev inequality, we may write that 

$$P[7 - k \cdot 1 \le X \le 7 + k \cdot 1] \ge 1 - \frac{1}{k^2}.$$

We want that $1 - \frac{1}{k^2} = 0.9$, from which we deduce that $k = \sqrt{10}$. Hence, the interval 
asked for is $[7 - \sqrt{10}, 7 + \sqrt{10}] \simeq [3.84, 10.16]$. 

Question no. 30 

We have: 


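$$f_X(x) = \frac{1}{2\sqrt{\pi}} \exp\left\{-\frac{x^2}{4}\right\} \implies \ln[f_X(x)] = -\ln(2\sqrt{\pi}) - \frac{x^2}{4}.$$

Because $E[X^2] = VAR[X] + (E[X])^2 = 2 + 0^2 = 2$, we can then write that 

$$H = E\left[-\ln f_X(X)\right] = \ln(2\sqrt{\pi}) + \frac{1}{4}E[X^2] = \ln(2\sqrt{\pi}) + \frac{1}{2} \simeq 1.766.$$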


Question no. 31 

(a) Let $X$ be the total number of devices that the technician will have tried to repair 
at the moment of his second failure. Then, we have that $X \sim NB(r = 2, p = 0.05)$. We 
seek 

$$P[X = 7] = \binom{6}{1}(0.05)^2(0.95)^5 \simeq 0.0116.$$

Remark. If the fact that the technician will receive at least seven out-of-order devices 
during this particular workday had not been mentioned in the statement of the problem, 
then we would have had to multiply the above probability by that of receiving at least 
seven out-of-order devices during an arbitrary workday, namely 

$$P[N \ge 7] = P[N > 6] = (7/8)^6 \simeq 0.4488.$$

(b) We want 

$$P[B(n = 10, p = 0.95) = 8] = \binom{10}{8}(0.95)^8(0.05)^2 \simeq 0.0746.$$

Remark. We can also write that 

$$P[B(n = 10, p = 0.95) = 8] = P[B(n = 10, p = 0.05) = 2] \overset{\text{Tab. B.1}}{\simeq} 0.9885 - 0.9139 = 0.0746.$$

(c) We have: 

$$P[B(n = 10, p = 0.95) = 8] = P[B(n = 10, p = 0.05) = 2]$$

$$\simeq P[\text{Poi}(\lambda = 10 \times 0.05) = 2] = e^{-0.5}\,\frac{(0.5)^2}{2!} \simeq 0.0758.$$

Remark. We could also have used Table B.2, page 278, to get $P[\text{Poi}(\lambda = 0.5) = 2]$. 

(d) Let $M$ be the number of devices that the technician could not repair, among the 
three taken at random. Then, $M \sim \text{Hyp}(N = 10, n = 3, d = 2)$. We seek 

$$P[M = 2] = \frac{\binom{2}{2}\binom{8}{1}}{\binom{10}{3}} = \frac{1}{15} \simeq 0.0667.$$


Question no. 32 

Let $Y$ be the number of cookies, among the 20, containing no raisins. If we assume 
that the number of raisins in a given cookie is independent of the number of raisins in 
the other cookies, then we may write that $Y \sim B(n = 20, p = P[X = 0])$. We have that 
$P[X = 0] = e^{-\lambda}$. Moreover, we find in Table B.2, page 278, that 

$$P[Y \le 2] \simeq 0.9245 \quad \text{if } p = 0.05.$$

It follows that we must take $\lambda \simeq -\ln 0.05 \simeq 3$. 
Remark. We can check that with $\lambda = 3$, we have: 

$$P[Y \le 2] \simeq (0.9502)^{20} + 20(0.9502)^{19}(0.0498) + \underbrace{\binom{20}{2}}_{190}(0.9502)^{18}(0.0498)^2 \simeq 0.925.$$

However, $\lambda$ did not have to be an integer, because it is the average number of raisins 
per cookie. 

Question no. 33 

Let c be the capacity of the storage tank. We want P[X > c] to be 0.01. We have: 



Thus, the capacity must be 


$$c = -10\,(\ln 0.01) \simeq 46 \text{ (thousands of liters)}.$$
Question no. 34 

We are interested in the lifetime X (in years) of a machine. From past experience, 
we estimate the probability that a machine of this type lasts for more than nine years 
to be 0.1. 

(a) We have: 



Then, 



We find that b = 2, which implies that a = 1 

(b) We want 



Therefore, a must be approximately equal to 1.56. 

(c) Let $Y$ be the number of machines, among the ten, that will last for less than nine 
years. Then, $Y \sim B(n = 10, p = 0.9)$. We seek 

$$P[Y \in \{8, 9\}] = P[B(n = 10, p = 0.1) \in \{1, 2\}] \overset{\text{Tab. B.1}}{\simeq} 0.9298 - 0.3487 = 0.5811.$$


Chapter 4 


Question no. 1 

We have: 



Question no. 2 

We find that 



Then, we may write that 


fx{x\Y = y) = y~~ — if 0 < x < 1 and 0 < y < 1. 


2 +y 


1 

2 
Question no. 3 

We first find that $VAR[X] = VAR[Y] = 1 - 0^2 = 1$. Then, 
$COV[X, Y] = \rho_{X,Y}\,\sigma_X \sigma_Y = 1 \cdot 1 \cdot 1 = 1$. 


Question no. 4 

We can write that $P[X + Y > 1] = 1/2$, by symmetry. This result can be checked 
as follows: 

$$P[X + Y > 1] = \int_0^1 \int_{1-x}^1 1\,dy\,dx = \int_0^1 [1 - (1 - x)]\,dx = \frac{1}{2}.$$


Question no. 5 

We have: 

= (!)(!) (|) ( i ) 1+1 + ( D ( 2 ) (!) (|) l+2 



4 

9 


Question no. 6 

We have: 

$$VAR[X - 2Y] = VAR[X] + 4\,VAR[Y] - 4\,COV[X, Y] = 1 + 4 - 4(1) = 1.$$


Question no. 7 

We can write that W ~ N(0 - 1 + 2(3), 1 + 2 + 4(4)) = N(5,19). 

Question no. 8 

By the central limit theorem, we may write that 

$$P[\text{Poi}(100) \le 100] \simeq P\left[N(100, 100) \le 100 + \frac{1}{2}\right] = \Phi(0.05) \overset{\text{Tab. B.3}}{\simeq} 0.5199$$

because if $X \sim \text{Poi}(100)$, then $X$ has the same distribution as (for instance) 
$\sum_{i=1}^{100} X_i$, 
where the $X_i$s are independent random variables having a Poisson distribution with 

parameter 1. Thus, we can use a $N(100, 100)$ distribution. 

Remark. Note that not all authors make a continuity correction in this case (as in the 
case of the binomial distribution). If we do not make this continuity correction, then we 
obtain directly that 

$$P[\text{Poi}(100) \le 100] \simeq P[N(100, 100) \le 100] = \Phi(0) = \frac{1}{2}.$$

Actually, we find, with the help of a mathematical software package, that $P[\text{Poi}(100) \le 
100] \simeq 0.5266$. Thus, here the fact of making a continuity correction does improve the 
approximation. 
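The comparison made in the remark can be reproduced with a few lines of code. The Python sketch below is an added illustration, not the software package mentioned in the text: it sums the Poisson probabilities exactly and evaluates the two normal approximations with `NormalDist` from the standard library. The function name `poisson_cdf` is an assumption.

```python
from math import exp
from statistics import NormalDist


def poisson_cdf(x: int, alpha: float) -> float:
    """Return P[X <= x] for X ~ Poi(alpha), building each term from the previous one."""
    term = exp(-alpha)
    total = term
    for k in range(1, x + 1):
        term *= alpha / k
        total += term
    return total


exact = poisson_cdf(100, 100.0)
normal = NormalDist(mu=100, sigma=10)  # variance 100, so standard deviation 10
with_correction = normal.cdf(100.5)
without_correction = normal.cdf(100)

print(exact, with_correction, without_correction)
```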
Question no. 9 

We have: 

$$P[X = 40] = P[39.5 \le X \le 40.5] \simeq P[39.5 \le N(40, 24) \le 40.5]$$

$$= 2\Phi\left(\frac{40.5 - 40}{\sqrt{24}}\right) - 1 \simeq 2\Phi(0.10) - 1 \overset{\text{Tab. B.3}}{\simeq} 0.0796.$$

Or: 

$$P[X = 40] \simeq f_Y(40), \quad \text{where } Y \sim N(40, 24),$$

$$= \frac{1}{\sqrt{2\pi}\sqrt{24}} \exp\left\{-\frac{1}{2}\,\frac{(40 - 40)^2}{24}\right\} = \frac{1}{\sqrt{48\pi}} \simeq 0.0814.$$

Remark. The answer obtained by using the binomial distribution and a software package 
is $P[X = 40] \simeq 0.0812$, which is also the value of the probability that we find by 
calculating (with more accuracy than above) $\Phi((40.5 - 40)/\sqrt{24}) \simeq \Phi(0.102)$ rather than 
$\Phi(0.10)$. 

Question no. 10 

By the central limit theorem, we may write that 

50 

Y :=]TV«N(50(0),4), 

i=1 

so that $P[Y \ge 0] \simeq P[N(0, \sigma_Y^2) \ge 0] = 1/2$. 

Question no. 11 

(a) We first calculate $p_X(x) \equiv 1/3$, $p_Y(0) = 1/3$, and $p_Y(2) = 2/3$. We can check that 

$p_X(x)p_Y(y) = p_{X,Y}(x, y)$ $\forall (x, y)$. So, $X$ and $Y$ are independent. 

(b) $F_{X,Y}(0, 1/2) = P[X \le 0, Y \le 1/2] = p_{X,Y}(-1, 0) + p_{X,Y}(0, 0) = 2/9$. 

(c) From part (a), we have that $p_X(x) \equiv 1/3$, for $x = -1, 0, 1$. Then, 

$z$          0     1
$p_Z(z)$    1/3   2/3

(d) We have that 

$$E[X^2Y^2] = (-1)^2(2)^2\,\frac{2}{9} + (1)^2(2)^2\,\frac{2}{9} + 0 = \frac{16}{9}.$$

Question no. 12 

We have (see Figure C.1): 

$$P[X > Y^2] = \int_0^1 \int_x^{\sqrt{x}} 2\,dy\,dx = \int_0^1 2(\sqrt{x} - x)\,dx = 2\left(\frac{x^{3/2}}{3/2} - \frac{x^2}{2}\right)\bigg|_0^1 = \frac{1}{3}.$$

Fig. C.1. Figure for solved exercise no. 12. 


Question no. 13 

(a) Let $N(t)$ be the number of buses in the interval $[0, t]$, where $t$ is in hours. We can 
write that $Y = N(t = 12.5) \sim \text{Poi}(50)$. 

(b) We have that $X_k \sim \text{Poi}(1)$, so that $E[X_k] = VAR[X_k] = 1$, for all $k$. Then, given that 
the $X_k$s are independent random variables, we deduce from the central limit theorem 
that $Y \approx N(50, 50)$. 

Question no. 14 

We have: 

$$P[X_1 = X_2] = \sum_{i=0}^{2} P[\{X_1 = i\} \cap \{X_2 = i\}] \overset{\text{ind.}}{=} \sum_{i=0}^{2} P[X_1 = i]\,P[X_2 = i]$$

$$= (1/2)^2 + (1/4)^2 + (1/4)^2 = 3/8 = 0.375.$$
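The sum above is simply the sum of the squared probabilities of the common distribution of $X_1$ and $X_2$. The Python sketch below is an added illustration using exact fractions; the assignment of the probabilities 1/2, 1/4, 1/4 to the values 0, 1, 2 is an assumption made for the example (the result does not depend on which value receives which probability).

```python
from fractions import Fraction

# Assumed common probability mass function of X_1 and X_2 on {0, 1, 2}
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}

# By independence, P[X_1 = X_2] = sum over i of P[X_1 = i] * P[X_2 = i]
p_equal = sum(p * p for p in pmf.values())
print(p_equal, float(p_equal))
```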

Question no. 15 

We have: 


$$P[\{X < 5\} \cap \{Y < 2\}] = P[\{X \le 4\} \cap \{Y \le 1\}] = P[Y = 1] = 0.1 + 0.1 + 0.2 = 0.4.$$


Question no. 16 

We first calculate 

$$f_{X_1}(x_1) = \int_0^1 (2 - x_1 - x_2)\,dx_2 = \frac{3}{2} - x_1 \quad \text{if } 0 < x_1 < 1.$$

By symmetry, we can then write that 

$$f_{X_2}(x_2) = \frac{3}{2} - x_2 \quad \text{if } 0 < x_2 < 1.$$

Next, we calculate 

$$E[X_i] = \int_0^1 x_i\left(\frac{3}{2} - x_i\right) dx_i = \frac{5}{12} \quad \text{for } i = 1, 2$$

and 

$$E[X_1X_2] = \int_0^1 \int_0^1 x_1x_2(2 - x_1 - x_2)\,dx_1\,dx_2 = \frac{1}{6}.$$

It follows that $COV[X_1, X_2] = E[X_1X_2] - E[X_1]E[X_2] = \frac{1}{6} - \left(\frac{5}{12}\right)^2 = -\frac{1}{144} \simeq -0.0069$. 

Question no. 17 

We first find that 

$y$          1/2    1
$p_Y(y)$     2/3   1/3

which implies that 

$w$         -1/2    0    1/2
$p_W(w)$     2/9   5/9   2/9

because $P[W = -1/2] = P[Y_1 = 1/2, Y_2 = 1] = (2/3)(1/3) = 2/9$, and so on. 

Question no. 18 

We can write that $\sum_{i=1}^{n} X_i \approx N(n(1/2), n(1/4))$, because $\mu_{X_i} = 1/2$ and $\sigma_{X_i}^2 = 1/4$, 

for all $i$. Then, 




i=i 


~ P 


n -I _ n 

N(0 - I)>1 757i i 


<Z> [ -/=) ~ 0.5398 Ta <£=P' 3 JL ~ o.l 

/n / Jn 


~ 0.4602 

n 400. 


Question no. 19 

(a) We have that $X_1 - X_2 \sim N(0 - 0, 25 + 25) = N(0, 50)$. We then calculate

$P[X_1 - X_2 > 15] = P\left[ N(0,1) > \frac{15 - 0}{\sqrt{50}} \right] \simeq 1 - \Phi(2.12) \overset{\text{Tab. B.3}}{\simeq} 1 - 0.9830 = 0.0170$.
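
The table lookup can be reproduced numerically. The sketch below is only an illustration: it evaluates $\Phi$ through the error function from the standard library, and the helper name `std_normal_cdf` is an assumption, not something defined in the book.

```python
import math


def std_normal_cdf(z):
    """Phi(z), the N(0, 1) distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# X1 - X2 ~ N(0, 50), so the threshold 15 is standardized by sqrt(50).
z = (15 - 0) / math.sqrt(50)
tail = 1.0 - std_normal_cdf(z)
print(round(z, 2), round(tail, 4))
```

The small difference from the tabulated value comes from rounding z to 2.12 before using the table.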


(b) By independence, we may write that

$f_{X_1,X_2}(x_1,x_2) = f_{X_1}(x_1) f_{X_2}(x_2) = \frac{1}{(2\pi)(25)} \exp\left\{ -\frac{x_1^2 + x_2^2}{50} \right\}$ for $(x_1, x_2) \in \mathbb{R}^2$.

(c) (i) We have that $P[X_1 = 2 \mid X_1 > 1] = 0$, because $X_1$ is a continuous random
variable.

(ii) Because $\{X_1 = 1\} \subset \{X_1 < 2\}$, we can write that $P[X_1 < 2 \mid X_1 = 1]$ is equal
to 1.

Question no. 20 

Let $X_i$ be the length of the ith section, for i = 1, ..., 100. By the central limit
theorem, we may write that

$P\left[ 970 < \sum_{i=1}^{100} X_i < 1030 \right] \simeq P[970 < N(\mu, \sigma^2) < 1030]$,

where $\mu = 100 \times 10 = 1000$ and $\sigma^2 = 100 \times 0.9 = 90$. Thus, we seek approximately
$1 - [\Phi(3.16) - \Phi(-3.16)] \overset{\text{Tab. B.3}}{\simeq} 2 \times (1 - 0.9992) = 0.0016$.

Question no. 21 

(a) We have: 


$f_X(x) = \int_0^1 3x^2 e^{-x} y(1-y) \, dy = 3x^2 e^{-x} \int_0^1 y(1-y) \, dy = 3x^2 e^{-x} \left( \frac{1}{2} - \frac{1}{3} \right) = \frac{x^2 e^{-x}}{2}$ if x > 0.

We find that X ~ G(α = 3, λ = 1).
Next, we calculate

$f_Y(y) = \int_0^\infty 3x^2 e^{-x} y(1-y) \, dx = 3y(1-y) \int_0^\infty x^2 e^{-x} \, dx = 3y(1-y)\Gamma(3) = 6y(1-y)$ if 0 < y < 1.

In this case, we find that Y ~ Be(α = 2, β = 2).

(b) We can check that $f_{X,Y}(x,y) = f_X(x) f_Y(y)$. Therefore, by definition, X and Y are
independent random variables.

(c) We have that $VAR[X] = 3/(1)^2 = 3$. Next, we calculate

$E[X^k] = \int_0^\infty \frac{1}{2} x^{k+2} e^{-x} \, dx = \frac{1}{2} \Gamma(k+3)$

for k = 1, 2, ... . Then, we have:

$\beta_2 = \frac{E[X^4] - 4E[X^3]E[X] + 6E[X^2](E[X])^2 - 4E[X](E[X])^3 + (E[X])^4}{(VAR[X])^2}$

$= \frac{360 - 720 + 648 - 324 + 81}{9} = \frac{45}{9} = 5$.
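
The raw moments and the coefficient $\beta_2$ can be recomputed from the gamma function. This is a minimal sketch that assumes, as found in part (a), the density $x^2 e^{-x}/2$ on $(0, \infty)$; the name `gamma_moment` is hypothetical.

```python
import math


def gamma_moment(k):
    """E[X^k] = Gamma(k + 3) / 2 for the density x^2 e^{-x} / 2 on (0, inf)."""
    return math.gamma(k + 3) / 2


m1, m2, m3, m4 = (gamma_moment(k) for k in range(1, 5))
variance = m2 - m1 ** 2
# Fourth central moment expanded in terms of the raw moments.
mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 4 * m1 * m1 ** 3 + m1 ** 4
beta2 = mu4 / variance ** 2
print(m1, m2, m3, m4, variance, beta2)
```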


(d) First, we calculate

$E[Y] = \int_0^1 y \cdot 6y(1-y) \, dy = \int_0^1 6y^2(1-y) \, dy = 6 \left( \frac{1}{3} - \frac{1}{4} \right) = \frac{1}{2}$.

Note that

$f_Y\left(\tfrac{1}{2} - y\right) = f_Y\left(\tfrac{1}{2} + y\right)$

for all $y \in (0, 1/2)$. That is, the function $f_Y(y)$ is symmetrical about the mean of Y.
Because all possible values of the random variable Y are located in a bounded interval,
we can then assert that $\beta_1 = 0$.

Question no. 22 

(a) Summing the elements of the columns and the rows of the table, respectively, we
find that

y          -1      0      1
p_Y(y)     2/9    4/9    1/3

and

x           0      1      2
p_X(x)     1/3    1/3    1/3

(b) We have, in particular, that $p_X(0) p_Y(-1) = \frac{1}{3} \times \frac{2}{9} \ne \frac{1}{9} = p_{X,Y}(0,-1)$. Therefore,
X and Y are not independent random variables.

Remark. Because there are some 0s in the two-dimensional table, X and Y could not be
independent. Indeed, a “0” in the table will never be equal to the product of the sums
of the elements of the corresponding row and column.

(c) (i) By definition, we have:

$p_Y(y \mid X = 1) = \frac{p_{X,Y}(1,y)}{p_X(1)} = 3\,p_{X,Y}(1,y) = 0$ if y = -1; 0 if y = 0; 1 if y = 1.

That is, $Y \mid \{X = 1\}$ is the constant 1.

(ii) We first find that $P[X \le 1] = \frac{1}{3} + \frac{1}{3} = \frac{2}{3}$. It follows that

$p_Y(-1 \mid X \le 1) = \frac{3}{2} P[\{Y = -1\} \cap \{X \le 1\}] = \frac{3}{2} \left( \frac{1}{9} + 0 \right) = \frac{1}{6}$.

Likewise, we calculate $p_Y(0 \mid X \le 1) = \frac{3}{2} \left( \frac{2}{9} + 0 \right) = \frac{1}{3}$. Thus, we have:

y                    -1      0      1      Σ
p_Y(y | X ≤ 1)       1/6    1/3    1/2     1


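(d) We first calculate

$E[X] \overset{(a)}{=} 1 \times \frac{1}{3} + 2 \times \frac{1}{3} = 1$, $E[Y] \overset{(a)}{=} -1 \times \frac{2}{9} + 1 \times \frac{1}{3} = \frac{1}{9}$

and

$E[XY] = 1 \times 1 \times \frac{1}{3} + 2 \times (-1) \times \frac{1}{9} = \frac{1}{9}$.

It follows that

$COV[X, Y] = E[XY] - E[X]E[Y] = \frac{1}{9} - 1 \times \frac{1}{9} = 0$

and, consequently, CORR[X, Y] = 0 (because STD[X] and STD[Y] are strictly positive).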

(e) We have:

W = 0 if (X, Y) = (0, -1) or (0, 0),
W = 1 if (X, Y) = (0, 1) or (1, -1) or (1, 0) or (1, 1),
W = 2 if (X, Y) = (2, -1) or (2, 0) or (2, 1).

Making use of the function $p_{X,Y}(x,y)$, we find, by incompatibility, that the function
$p_W(w)$ is given by

w           0      1      2      Σ
p_W(w)     1/3    1/3    1/3     1
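
The results of parts (d) and (e) can be verified by enumeration. The joint table itself is not reproduced in this solution, so the sketch below uses a joint probability mass function inferred from the marginals and the conditional probabilities quoted in parts (a) to (c); it also assumes that W = max{X, Y}, which matches the case definition in part (e). Both are assumptions, and the helper `expect` is hypothetical.

```python
from fractions import Fraction as F

# Joint pmf p_{X,Y}(x, y) inferred from parts (a)-(c); treat as an assumption.
joint = {
    (0, -1): F(1, 9), (0, 0): F(2, 9), (0, 1): F(0),
    (1, -1): F(0),    (1, 0): F(0),    (1, 1): F(1, 3),
    (2, -1): F(1, 9), (2, 0): F(2, 9), (2, 1): F(0),
}


def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())


cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# pmf of W = max{X, Y} (assumed definition), summing over incompatible events.
pmf_w = {}
for (x, y), p in joint.items():
    pmf_w[max(x, y)] = pmf_w.get(max(x, y), F(0)) + p

print(cov, pmf_w)
```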


Question no. 23 

We have: 


$P[X + Y \le 4 \mid X \le 2] = \frac{P[\{X + Y \le 4\} \cap \{X \le 2\}]}{P[X \le 2]}$


p[(x,y)e{(i,2),(i,3),(2,2)}] _ ^ + 


i +1 
6 ' 6 


1 _ (l2 + \ +°) 


- =0.5. 


Question no. 24 

Let X be the number of times that the digit “7” appears among the 10,000 random
digits. Then, X ~ B(n = 10,000, p = 0.1). We seek

$P[X > 968] = P[X \ge 969] \simeq P\left[ N(0,1) \ge \frac{969 - 0.5 - 1000}{\sqrt{900}} \right] = Q(-1.05) = \Phi(1.05) \overset{\text{Tab. B.3}}{\simeq} 0.8531$.

Question no. 25

(a) We have: 


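$E[X^r Y^s] = \int_0^1 \int_0^x x^r y^s \frac{1}{x} \, dy \, dx = \int_0^1 x^{r-1} \frac{x^{s+1}}{s+1} \, dx = \int_0^1 \frac{x^{r+s}}{s+1} \, dx = \frac{1}{(s+1)(r+s+1)}$

for r, s = 0, 1, 2, ... .

(b) We calculate

$E[X^2] = \int_0^1 x^2 f_X(x) \, dx = \int_0^1 x^2 \cdot 1 \, dx = \frac{1}{3}$

and

$E[X^2 Y^0] = \frac{1}{(0+1)(2+0+1)} = \frac{1}{3}$.

(c) We deduce from part (a) that $E[X] = E[X^1 Y^0] = \frac{1}{2}$, $E[X^2] = E[X^2 Y^0] = \frac{1}{3}$,
$E[Y] = \frac{1}{4}$, $E[Y^2] = \frac{1}{9}$, and $E[XY] = \frac{1}{6}$. Then, we have:

$VAR[X] = \frac{1}{3} - \left(\frac{1}{2}\right)^2 = \frac{1}{12}$, $VAR[Y] = \frac{1}{9} - \left(\frac{1}{4}\right)^2 = \frac{7}{144}$

and

$\rho_{X,Y} = \frac{\frac{1}{6} - \frac{1}{2} \cdot \frac{1}{4}}{\sqrt{\frac{1}{12} \cdot \frac{7}{144}}} \simeq 0.6547$.

Question no. 26

(a) We have:

$\frac{1}{8} = p_Y(-1 \mid X = 2) = \frac{p_{X,Y}(2,-1)}{p_X(2)} = \frac{1/16}{P[X = 2]} \implies P[X = 2] = \frac{1}{2}$.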


(b) We first find that

$p_{X,Y}(2,0) = p_Y(0 \mid X = 2)\, p_X(2) \overset{(a)}{=} (3/8)(1/2) = 3/16$.

Likewise, we have that $p_{X,Y}(2,1) = (1/2)(1/2) = 1/4$. We then obtain the following
table:

y \ x        0       1       2      p_Y(y)
-1          1/16    1/8     1/16    1/4
 0          3/16    1/8     3/16    1/2
 1          0       0       1/4     1/4
p_X(x)      1/4     1/4     1/2     1

(c) We can check that W follows a binomial distribution with parameters n = 2 and
p = 1/2, because [see $p_Y(y)$ in the above table]

$p_W(w) = \binom{2}{w} (1/2)^2$ for w = 0, 1, 2.

Question no. 27

(a) We have:

E


[ [ -°^-dydx= f -dx = 1 . 

Jo Jo X H 2 J o 2 


17 


(b) We first calculate (see Figure C.2)

$f_X(x) = \int f_{X,Y}(x,y) \, dy = \frac{x^3}{4}$ for 0 < x < 2.

Then,

$E[X^2] = \int_0^2 x^2 \frac{x^3}{4} \, dx = \frac{x^6}{24} \Big|_0^2 = \frac{8}{3} \simeq 2.67$.



Fig. C.2. Figure for solved exercise no. 27. 


(c) Making use of part (b), we may write that we seek the number $x_m$ for which

$\int_0^{x_m} \frac{x^3}{4} \, dx = \frac{1}{2} \iff \frac{x_m^4}{16} = \frac{1}{2} \implies x_m = 8^{1/4} \simeq 1.68$

because the median $x_m$ must be positive, inasmuch as $X \in (0, 2)$.

Question no. 28 

We seek 


$P[\{X < 1\} \cap \{Y < 1\}] \overset{\text{ind.}}{=} P[X < 1]\,P[Y < 1] \simeq 0.3402$

because

$P[X < 1] = \int_0^1 \frac{1}{2} e^{-x/2} \, dx = 1 - e^{-1/2}$

and

$P[Y < 1] = \int_0^1 4y\, e^{-2y^2} \, dy = 1 - e^{-2}$.
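
A quick numerical check of the two integrals is sketched below. It assumes the densities $\frac{1}{2} e^{-x/2}$ and $4y\,e^{-2y^2}$; the second is the reading that is consistent with the stated value $1 - e^{-2}$ and with the final answer 0.3402. The `simpson` helper is a hypothetical name for a standard composite Simpson rule.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


p_x = simpson(lambda x: 0.5 * math.exp(-x / 2), 0.0, 1.0)
p_y = simpson(lambda y: 4 * y * math.exp(-2 * y * y), 0.0, 1.0)
print(round(p_x, 4), round(p_y, 4), round(p_x * p_y, 4))
```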


Question no. 29 

Let $X_k$ be the kth number taken at random. We have that $E[X_k] = 1/2$ and
$VAR[X_k] = 1/12$, for all k. Then, by the central limit theorem, we may write that

$P[45 \le S \le 55] \simeq P[45 \le N(100(1/2), 100(1/12)) \le 55]$

$\simeq P[-1.73 \le N(0,1) \le 1.73] = \Phi(1.73) - \Phi(-1.73)$

$= 2\Phi(1.73) - 1 \overset{\text{Tab. B.3}}{\simeq} 2(0.958) - 1 = 0.916$.


Question no. 30 

(a) Let $X_i$ be the number of floods during the ith year. Then, $X_i \sim$ Poi(α = 2) ∀i and
the $X_i$'s are independent. We seek

$P\left[ \sum_{i=1}^{50} X_i > 80 \right] \overset{\text{CLT}}{\simeq} P[N(50(2), 50(2)) > 80] = P\left[ N(0,1) > \frac{80 - 100}{10} \right]$

$= 1 - \Phi(-2) = \Phi(2) \overset{\text{Tab. B.3}}{\simeq} 0.9772$.

(b) Let $Y_i$ be the duration of the ith flood. Then, $Y_i \sim$ Exp(λ = 1/5) ∀i and the $Y_i$'s are
independent random variables. Because $E[Y_i] = 1/\lambda = 5$ and $VAR[Y_i] = 1/\lambda^2 = 25$, we
calculate

$P\left[ \sum_{i=1}^{50} Y_i < 200 \right] \simeq P[N(50(5), 50(25)) < 200] \simeq P[N(0,1) < -1.41]$

$\overset{\text{Tab. B.3}}{\simeq} 1 - 0.9207 \simeq 0.079$.


Chapter 5 


Question no. 1 

We have: 


$R(x) := P[X > x] = P[Y^2 > x] = P[Y > \sqrt{x}] = 1 - \sqrt{x}$ if $0 \le x \le 1$,
and $R(x) = 0$ if x > 1.

Question no. 2

We can write that $Y \mid \{X < 2\} \sim$ Exp(1) and $Y \mid \{X > 2\} \sim$ Exp(1/2). It follows
that

$E[Y] = E[Y \mid X < 2]\,P[X < 2] + E[Y \mid X > 2]\,P[X > 2]$

$= 1 \times \left[ 1 - e^{-(1/2)2} \right] + 2 \times e^{-(1/2)2} = 1 + e^{-1}$.

Because E[X] = 2 years, we have that MTBF $= 731 + e^{-1} \simeq 731.37$ days.

Question no. 3 

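First, we calculate

$F_T(t) = \int_0^t f_T(s) \, ds = \int_0^t \frac{1}{\lambda} \exp\left\{ -\frac{e^s - 1}{\lambda} + s \right\} ds = -\exp\left\{ -\frac{e^s - 1}{\lambda} \right\} \Big|_0^t = 1 - \exp\left\{ -\frac{e^t - 1}{\lambda} \right\}$ for t > 0.

It follows that

$r(t) = \frac{f_T(t)}{1 - F_T(t)} = \frac{e^t}{\lambda}$ for t > 0.

Question no. 4

We calculate

$r(0) = \frac{e^{-\lambda} \lambda^0 / 0!}{\sum_{j=0}^{\infty} e^{-\lambda} \lambda^j / j!} = \frac{e^{-\lambda}}{1} = e^{-\lambda}$

and

$r(1) = \frac{e^{-\lambda} \lambda^1 / 1!}{\sum_{j=1}^{\infty} e^{-\lambda} \lambda^j / j!} = \frac{e^{-\lambda} \lambda}{e^{-\lambda}(e^{\lambda} - 1)} = \frac{\lambda}{e^{\lambda} - 1}$.

We have:

$r(0) < r(1) \iff e^{-\lambda} < \frac{\lambda}{e^{\lambda} - 1} \iff 1 - e^{-\lambda} < \lambda$.

Let

$g(\lambda) = \lambda + e^{-\lambda} - 1$.

Because g(0) = 0 and

$g'(\lambda) = 1 - e^{-\lambda} > 0$ for all λ > 0,

we may assert that g(λ) > 0, for λ > 0. Hence, we conclude that r(0) < r(1), so that
the failure rate function is increasing at k = 0.

Remark. Actually, it can be shown that the function r(k) is increasing at any
$k \in \{0, 1, \ldots\}$.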
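
The remark about the Poisson failure rate can be illustrated numerically. The sketch below computes $r(k) = p(k) / P[X \ge k]$ for a few values of k; the rate λ = 2 is an arbitrary illustrative choice, the infinite tail is truncated after 200 terms, and the function names are hypothetical.

```python
import math


def poisson_pmf(j, lam):
    """P[X = j] for X ~ Poi(lam), computed on the log scale for stability."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def failure_rate(k, lam, terms=200):
    """r(k) = p(k) / P[X >= k], with the tail sum truncated after `terms` terms."""
    tail = sum(poisson_pmf(j, lam) for j in range(k, k + terms))
    return poisson_pmf(k, lam) / tail


lam = 2.0  # illustrative value only
rates = [failure_rate(k, lam) for k in range(6)]
print([round(r, 4) for r in rates])
```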
Question no. 5 


We have: 



$R(x) := P[X > x] = \int_x^1 1 \, dt = 1 - x$ if $0 \le x \le 1$.

Therefore,

$AFR(0, 1/2) = \frac{\ln[R(0)] - \ln[R(1/2)]}{(1/2) - 0} = 2[\ln 1 - \ln(1/2)] = 2 \ln 2$.


Question no. 6 

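The components are connected in series, therefore we indeed seek the probability
$P[X_2 < X_1]$. Making use of the above formula, we can write that

$P[X_2 < X_1] = \int_0^\infty P[X_2 < x_1 \mid X_1 = x_1] \, f_{X_1}(x_1) \, dx_1$.

By independence, we obtain that

$P[X_2 < X_1] = \int_0^\infty \left[ 1 - e^{-\lambda_2 x_1} \right] \lambda_1 e^{-\lambda_1 x_1} \, dx_1 = 1 - \lambda_1 \int_0^\infty e^{-(\lambda_1 + \lambda_2) x_1} \, dx_1 = \frac{\lambda_2}{\lambda_1 + \lambda_2}$.

Remark. Note that $P[X_2 < X_1] = 1/2$ if $\lambda_1 = \lambda_2$, which is logical, by symmetry (because
$P[X_2 = X_1] = 0$, by continuity).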

Question no. 7 

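We have that $P[T > t_0] = P[\{T_1 > t_0\} \cup \{T_2 > t_0\}]$. Let $A_k = \{T_k > t_0\}$, for
k = 1, 2. We seek

$p := P[A_1 \cap A_2 \mid A_1 \cup A_2] = \frac{P[A_1 \cap A_2]}{P[A_1 \cup A_2]}$,

because $\{A_1 \cap A_2\} \subset \{A_1 \cup A_2\}$.

Next, we have:

$P[A_1 \cap A_2] \overset{\text{ind.}}{=} P[A_1]\,P[A_2] = e^{-(\lambda_1 + \lambda_2) t_0}$

and

$P[A_1 \cup A_2] = P[A_1] + P[A_2] - P[A_1 \cap A_2] = e^{-\lambda_1 t_0} + e^{-\lambda_2 t_0} - e^{-(\lambda_1 + \lambda_2) t_0}$,

so that

$p = \frac{e^{-(\lambda_1 + \lambda_2) t_0}}{e^{-\lambda_1 t_0} + e^{-\lambda_2 t_0} - e^{-(\lambda_1 + \lambda_2) t_0}}$.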


Question no. 8 

Let F = “the device operates at the initial time.” In the case of a series device
constituted of one brand A and one brand B component, we have that $P[F] = (0.9)^2 = 0.81$.
Therefore, the probability that at least one of the two devices works is given by
$1 - (1 - 0.81)^2 = 0.9639$.

If we build a single device as described above, we have:

$P[F] = (1 - (0.1)^2)(1 - (0.1)^2) = 0.9801$.

Thus, it is better to duplicate the components rather than to duplicate the devices. 

Remark. The conclusion of this exercise can be generalized as follows: it is always better 
to duplicate the components in a series system than to build two distinct systems. 
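
The comparison between duplicating devices and duplicating components can be written with two small reliability helpers. This is a sketch under the assumptions of the exercise (independent components, each working with probability 0.9); the function names `series` and `parallel` are hypothetical.

```python
from functools import reduce


def series(*ps):
    """Reliability of independent components in series: all must work."""
    return reduce(lambda acc, p: acc * p, ps, 1.0)


def parallel(*ps):
    """Reliability of independent components in parallel: at least one works."""
    return 1.0 - reduce(lambda acc, p: acc * (1.0 - p), ps, 1.0)


p = 0.9
two_devices = parallel(series(p, p), series(p, p))      # duplicate the device
one_device = series(parallel(p, p), parallel(p, p))     # duplicate the components
print(round(two_devices, 4), round(one_device, 4))
```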

Question no. 9 

We have: 

$H(x_1, x_2, x_3, x_4) = \max\{x_1 x_2, x_3\}\, x_4 = [1 - (1 - x_1 x_2)(1 - x_3)]\, x_4$

$= (x_1 x_2 + x_3 - x_1 x_2 x_3)\, x_4$.
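
Because each $x_k$ is 0 or 1, the equality of the two forms of the structure function can be confirmed by enumerating all 16 state vectors. The following sketch does exactly that; the function names are hypothetical.

```python
from itertools import product


def h_max(x1, x2, x3, x4):
    """Structure function written with the max of the two parallel branches."""
    return max(x1 * x2, x3) * x4


def h_poly(x1, x2, x3, x4):
    """The same structure function in expanded polynomial form."""
    return (x1 * x2 + x3 - x1 * x2 * x3) * x4


states = list(product((0, 1), repeat=4))
agree = all(h_max(*x) == h_poly(*x) for x in states)
working = sum(h_poly(*x) for x in states)  # number of states where the system works
print(agree, working)
```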


Question no. 10 

The minimal path sets of the system are the following: $MP_1 = \{1, 2, 4\}$ and $MP_2 = \{3, 4\}$.
Hence,

$\pi_1(x_1, x_2, x_3, x_4) = x_1 x_2 x_4$ and $\pi_2(x_1, x_2, x_3, x_4) = x_3 x_4$,

so that

$H(x_1, x_2, x_3, x_4) = 1 - (1 - x_1 x_2 x_4)(1 - x_3 x_4)$.

Remark. Because $x_k = 0$ or 1 ∀k, we may write that $x_k^2 = x_k$. It follows that

$H(x_1, \ldots, x_4) = x_1 x_2 x_4 + x_3 x_4 - x_1 x_2 x_3 x_4^2 = x_1 x_2 x_4 + x_3 x_4 - x_1 x_2 x_3 x_4$
$= (x_1 x_2 + x_3 - x_1 x_2 x_3)\, x_4$,

which agrees with the result in the previous exercise.

Chapter 6


Question no. 1 

(a) Let $T_k$, for k = 1, ..., n, be the lifetime of component no. k. If the components
are connected in series, then, by the memoryless property of the exponential distribution,
we may express the time T between two system breakdowns as follows:
$T = \min\{T_1, \ldots, T_n\}$. From the remark after Proposition 5.2.1, we deduce that
$T \sim$ Exp($\mu_1 + \cdots + \mu_n$). Hence, we may assert that {N(t), t ≥ 0} is a continuous-time
Markov chain. Actually, it is a Poisson process with rate $\lambda = \mu_1 + \cdots + \mu_n$.

(b) When the components are connected in parallel, we have that $T = \max\{T_1, \ldots, T_n\}$.
Now, the maximum $T_{1,2}$ of two independent exponential random variables does not have
an exponential distribution. Indeed, we can write (see Example 5.2.2) that

$P[T_{1,2} \le t] = (1 - e^{-\mu_1 t})(1 - e^{-\mu_2 t})$ for t > 0,

so that

$f_{T_{1,2}}(t) = \frac{d}{dt} P[T_{1,2} \le t] = \mu_1 e^{-\mu_1 t} + \mu_2 e^{-\mu_2 t} - (\mu_1 + \mu_2) e^{-(\mu_1 + \mu_2) t}$ for t > 0.

Because we cannot write $f_{T_{1,2}}(t)$ in the form

$f_{T_{1,2}}(t) = \lambda e^{-\lambda t}$ for t > 0

for some λ > 0, we must conclude that $T_{1,2}$ is not exponentially distributed. By extension,
T is not an exponential random variable either, so that {N(t), t ≥ 0} is not a
continuous-time Markov chain.

(c) Finally, in the case when the components are placed in standby redundancy, the
random variable T is not exponentially distributed either (for n ≥ 2). Indeed, if $\mu_k = \mu$ ∀k,
then [see (4.9)] T ~ G(n, μ). Because T does not have an exponential distribution
in this particular case, it cannot be exponentially distributed for arbitrary $\mu_k$'s. Thus,
{N(t), t ≥ 0} is not a continuous-time Markov chain.

Question no. 2 

We can write that X(t), given that X(0) = i, has a negative binomial distribution
with parameters r = i and $p = e^{-\lambda t}$ [see (3.2)]. From Table 3.1, p. 89, we deduce that
the expected value of X(t) is given by

$E[X(t) \mid X(0) = i] = \frac{r}{p} = i e^{\lambda t}$.

To justify this result, notice that

$p_{1,j}(t) = e^{-\lambda t} (1 - e^{-\lambda t})^{j-1}$ for j ≥ 1.

That is,

$P[X(t) = j \mid X(0) = 1] = P[\text{Geo}(p := e^{-\lambda t}) = j]$ for j = 1, 2, ... .

Now, when X(0) = i > 1, we can represent X(t) as the sum of i (independent)
geometric random variables with common parameter p. Then, by linearity of the
mathematical expectation, we deduce that

$E[X(t) \mid X(0) = i] = i \times E[\text{Geo}(p := e^{-\lambda t})] = i \times \frac{1}{p} = i e^{\lambda t}$.


Question no. 3 

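The balance equations of the system are the following:

state j     departure rate from j = arrival rate to j
0           $\lambda \pi_0 = \mu \pi_1$
1           $(2\lambda + \mu) \pi_1 = \lambda \pi_0 + 2\mu \pi_2$
2           $2\mu \pi_2 = 2\lambda \pi_1$

We deduce from the first and the third equation that

$\pi_1 = \frac{\lambda}{\mu} \pi_0$ and $\pi_2 = \frac{\lambda}{\mu} \pi_1 = \left( \frac{\lambda}{\mu} \right)^2 \pi_0$.

Hence, we can write that

$\pi_0 + \frac{\lambda}{\mu} \pi_0 + \left( \frac{\lambda}{\mu} \right)^2 \pi_0 = 1 \implies \pi_0 = \left[ 1 + \frac{\lambda}{\mu} + \left( \frac{\lambda}{\mu} \right)^2 \right]^{-1}$,

from which we obtain the values of $\pi_1$ and $\pi_2$.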
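
The closed-form solution can be plugged back into the three balance equations as a check. The rates λ = 1 and μ = 2 below are illustrative assumptions, not values from the exercise, and the function name is hypothetical.

```python
def limiting_probabilities(lam, mu):
    """(pi_0, pi_1, pi_2) from pi_1 = r pi_0, pi_2 = r^2 pi_0, with r = lam / mu."""
    r = lam / mu
    pi0 = 1.0 / (1.0 + r + r * r)
    return pi0, r * pi0, r * r * pi0


lam, mu = 1.0, 2.0  # hypothetical rates, for illustration only
pi0, pi1, pi2 = limiting_probabilities(lam, mu)

# Residuals "departure rate - arrival rate" for states 0, 1 and 2.
residuals = [
    lam * pi0 - mu * pi1,
    (2 * lam + mu) * pi1 - (lam * pi0 + 2 * mu * pi2),
    2 * mu * pi2 - 2 * lam * pi1,
]
print(pi0, pi1, pi2, residuals)
```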


Question no. 4 

Let $\pi_k^* = P[X(t_0) = k \mid X(t_0) \in \{2, 3, 4\}]$. We can write that

$\pi_k^* = \frac{\pi_k}{\pi_2 + \pi_3 + \pi_4} = \frac{\rho^k}{\rho^2 + \rho^3 + \rho^4} = \frac{\rho^{k-2}}{1 + \rho + \rho^2}$ for k = 2, 3, 4.

Next, let $Q^*$ be the waiting time of the new customer. By the memoryless property
of the exponential distribution, we can write that

$E[Q^* \mid X(t_0) \in \{2,3,4\}] = \sum_{k=2}^{4} E[Q^* \mid \{X(t_0) = k\} \cap \{X(t_0) \in \{2,3,4\}\}] \times P[X(t_0) = k \mid X(t_0) \in \{2,3,4\}]$

$= \sum_{k=2}^{4} \frac{k}{\mu} \pi_k^* = \frac{1}{\mu} \left( \frac{2 + 3\rho + 4\rho^2}{1 + \rho + \rho^2} \right)$.

Question no. 5

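The balance equations are:

state j           departure rate from j = arrival rate to j
0                 $\lambda \pi_0 = \mu \pi_1$
1                 $(\lambda + \mu) \pi_1 = \lambda \pi_0 + \mu \pi_2$
2                 $(\lambda + \mu) \pi_2 = \lambda \pi_1 + 2\mu \pi_3$
n ∈ {3, 4, ...}   $(\lambda + 2\mu) \pi_n = \lambda \pi_{n-1} + 2\mu \pi_{n+1}$

Next, we calculate

$\Pi_k = \frac{\lambda \lambda \lambda \cdots \lambda}{\mu \mu (2\mu) \cdots (2\mu)} = \left( \frac{\lambda}{\mu} \right)^2 \left( \frac{\lambda}{2\mu} \right)^{k-2}$ for k = 2, 3, ... .

Hence, we deduce that the sum $\sum \Pi_k$ converges if and only if

$\sum_{k=2}^{\infty} \Pi_k < \infty \iff \left( \frac{\lambda}{\mu} \right)^2 \sum_{k=2}^{\infty} \left( \frac{\lambda}{2\mu} \right)^{k-2} < \infty \iff \sum_{k=2}^{\infty} \left( \frac{\lambda}{2\mu} \right)^{k-2} < \infty \iff \frac{\lambda}{2\mu} < 1$,

as we could have guessed.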

Question no. 6 

The limiting probabilities of the system are given by [see (6.17)]

$\pi_j = \frac{\rho^j (1 - \rho)}{1 - \rho^3}$ for j = 1, 2.

Let F be the event “the customer who arrived at time $t_0 + 2$ was able to enter the
system.” We can write that

$P[F] = P[F \mid X(t_0^-) = 1]\,P[X(t_0^-) = 1 \mid X(t_0^-) \in \{1,2\}] + P[F \mid X(t_0^-) = 2]\,P[X(t_0^-) = 2 \mid X(t_0^-) \in \{1,2\}]$

$= 1 \times \frac{\pi_1}{\pi_1 + \pi_2} + P[F \mid X(t_0^-) = 2] \, \frac{\pi_2}{\pi_1 + \pi_2}$.

Now, given that the system was full when a departure took place at time $t_0$, the
customer who arrived at $t_0 + 2$ was able to enter the system if and only if the one who
started being served at $t_0$ left the system before time $t_0 + 2$. That is, if S ~ Exp(μ),

$P[F \mid X(t_0^-) = 2] = P[S \le 2] = 1 - e^{-2\mu}$.

Hence, the required probability is

$P[F] = \frac{\rho}{\rho + \rho^2} + (1 - e^{-2\mu}) \frac{\rho^2}{\rho + \rho^2} = \frac{1}{1 + \rho} \left[ 1 + \rho (1 - e^{-2\mu}) \right]$.


Question no. 7 

When $\lambda_a = \lambda$ and the service rate is μ = λ, the limiting probabilities of the system
are [see (6.16)]

$\pi_j = \frac{1}{4}$ for j = 0, 1, 2, 3.

The average entering rate of customers into the system is $\lambda_e = \lambda(1 - \pi_3) = 3\lambda/4$, so
that the average amount of money that the system earns per unit of time is equal to
$\$x \times 3\lambda/4$.

Next, if $\lambda^* = \lambda/2$ and $\mu^* = 2\mu$, then $\rho^* = \rho/4 = 1/4$ and [see (6.17)]

$\pi_j^* = \frac{(1/4)^j [1 - (1/4)]}{1 - (1/4)^4}$ for j = 0, 1, 2, 3.

It follows that

$\lambda_e^* = \lambda^* (1 - \pi_3^*) = \frac{\lambda}{2} \left( 1 - \frac{3}{255} \right) = \frac{126\lambda}{255}$

and, because the entering customers pay the same amount as before, we conclude that
the server is better off to serve at rate μ. Indeed, we have:

$\$x \times \frac{3\lambda}{4} > \$x \times \frac{126\lambda}{255}$.
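
The two entering rates can be recomputed exactly. The sketch below assumes a single-server queue with finite capacity 3 (states 0 to 3, as the limiting probabilities above suggest) and normalises λ to 1; the function name `blocking_probability` is hypothetical.

```python
from fractions import Fraction as F


def blocking_probability(rho, c=3):
    """pi_c for a finite-capacity queue with states 0..c and traffic intensity rho."""
    if rho == 1:
        return F(1, c + 1)
    return rho ** c * (1 - rho) / (1 - rho ** (c + 1))


lam = F(1)  # arrival rate lambda, normalised to 1 (assumption for illustration)
rate_before = lam * (1 - blocking_probability(F(1)))            # rho = 1
rate_after = (lam / 2) * (1 - blocking_probability(F(1, 4)))    # rho* = 1/4
print(rate_before, rate_after, float(rate_before), float(rate_after))
```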


Question no. 8 

By the memoryless property of the exponential distribution, we can write that
$W_i := T_i - t_0$ is an exponentially distributed random variable with parameter μ. Indeed, we
have:

$P[W_i > t] = P[S_i - (t_0 - t_i) > t \mid S_i > t_0 - t_i]$ for i = 1, 2,

where $S_i$ is the service time of the customer being served by server no. i and $t_i > 0$ is
its (known) arrival time. Because $S_i$ is an Exp(μ) random variable, we can write that

$P[S_i - (t_0 - t_i) > t \mid S_i > t_0 - t_i] = P[S_i > t + (t_0 - t_i) \mid S_i > t_0 - t_i]$

$= P[S_i > t] = e^{-\mu t}$.

Hence, by independence, the required probability is given by

$P[T_2 - t_0 < T_1 - t_0 + 1] = P[W_2 < W_1 + 1]$

$= P[S_2 < S_1 + 1] = \int_0^\infty \int_0^{s_1 + 1} \mu e^{-\mu s_1} \mu e^{-\mu s_2} \, ds_2 \, ds_1$

$= \int_0^\infty \mu \left[ e^{-\mu s_1} - e^{-\mu (2 s_1 + 1)} \right] ds_1 = 1 - \frac{1}{2} e^{-\mu}$.

Remark. Because, by symmetry and continuity, $P[S_1 < S_2] = 1/2$, we deduce that

$P[S_1 < S_2 < S_1 + 1] = \frac{1}{2} (1 - e^{-\mu})$,

from which we retrieve the result in Example 6.3.1.

Question no. 9 

Because the service rates are not necessarily equal, X(t) cannot simply be the number
of customers in the system at time t. We define the states:

0: the system is empty;
$1_1$: only server no. 1 is busy;
$1_2$: only server no. 2 is busy;
2: the two servers are busy and nobody is waiting;
3: the two servers are busy and somebody is waiting.

We have:

state j     departure rate from j = arrival rate to j
0           $\lambda \pi_0 = \mu_1 \pi_{1_1} + \mu_2 \pi_{1_2}$
$1_1$       $(\lambda + \mu_1) \pi_{1_1} = \lambda \pi_0 + \mu_2 \pi_2$
$1_2$       $(\lambda + \mu_2) \pi_{1_2} = \mu_1 \pi_2$
2           $(\lambda + \mu_1 + \mu_2) \pi_2 = \lambda (\pi_{1_1} + \pi_{1_2}) + (\mu_1 + \mu_2) \pi_3$
3           $(\mu_1 + \mu_2) \pi_3 = \lambda \pi_2$

The value of $\bar{N}$ is given by

$\bar{N} = \pi_{1_1} + \pi_{1_2} + 2\pi_2 + 3\pi_3$.

Moreover, the average entering rate of customers is $\lambda_e = \lambda(1 - \pi_3)$. It follows, from
Little's formula, that

$\bar{T} = \frac{\bar{N}}{\lambda_e} = \frac{\pi_{1_1} + \pi_{1_2} + 2\pi_2 + 3\pi_3}{\lambda (1 - \pi_3)}$.

Question no. 10

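We have an M/G/2/2 queueing system for which S ~ U(2, 4), so that E[S] = 3.
Making use of (6.26), we can write that

$\pi_0 = \left( \sum_{k=0}^{2} \frac{(3\lambda)^k}{k!} \right)^{-1} = \left[ 1 + 3\lambda + \frac{(3\lambda)^2}{2} \right]^{-1}$

and

$\pi_1 = 3\lambda \pi_0$ and $\pi_2 = \frac{(3\lambda)^2}{2} \pi_0$.

If λ = 1/3, we have:

$\pi_0 = \left[ 1 + 1 + \frac{1}{2} \right]^{-1} = \frac{2}{5}$, $\pi_1 = \pi_0 = \frac{2}{5}$ and $\pi_2 = \frac{1}{2} \pi_0 = \frac{1}{5}$.

It follows that

$E[N] = 1 \times \pi_1 + 2 \times \pi_2 = \frac{4}{5}$ and $E[N^2] = 1 \times \pi_1 + 4 \times \pi_2 = \frac{6}{5}$,

so that

$VAR[N] = \frac{6}{5} - \left( \frac{4}{5} \right)^2 = \frac{14}{25}$.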
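
The limiting probabilities and the moments of N can be recomputed exactly. This sketch implements the truncated-Poisson form used above, with $a = \lambda E[S]$; the function name `loss_system_probabilities` is a hypothetical label.

```python
from fractions import Fraction as F
from math import factorial


def loss_system_probabilities(a, s):
    """pi_k proportional to a^k / k! for k = 0..s, where a = lambda * E[S]."""
    weights = [a ** k / factorial(k) for k in range(s + 1)]
    total = sum(weights)
    return [w / total for w in weights]


pi = loss_system_probabilities(F(1, 3) * 3, 2)  # lambda = 1/3, E[S] = 3
mean = sum(k * p for k, p in enumerate(pi))
second = sum(k * k * p for k, p in enumerate(pi))
variance = second - mean ** 2
print(pi, mean, second, variance)
```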


Chapter 7 


Question no. 1 

We have that

$E[Y_n] = \frac{1}{2} \{E[X_n] + E[X_{n-1}]\} = 0$.

Moreover, for m ∈ {-n + 1, -n + 2, ...},

$E[Y_n Y_{n+m}] = E\left[ \frac{X_n + X_{n-1}}{2} \cdot \frac{X_{n+m} + X_{n+m-1}}{2} \right]$

$= \frac{1}{4} \{E[X_n X_{n+m}] + E[X_n X_{n+m-1}] + E[X_{n-1} X_{n+m}] + E[X_{n-1} X_{n+m-1}]\}$.

In the case when m = 0, we obtain that

$E[Y_n Y_{n+m}] = E[Y_n^2] = \frac{1}{4} \{E[X_n^2] + 0 + 0 + E[X_{n-1}^2]\} = \frac{\sigma^2}{2}$.

Remark. Or, because $E[Y_n] = 0$:

$E[Y_n^2] = VAR[Y_n] \overset{\text{ind.}}{=} \frac{1}{4} \{VAR[X_n] + VAR[X_{n-1}]\} = \frac{\sigma^2}{2}$.

Next, if m > 0, we calculate

$E[Y_n Y_{n+m}] = \frac{1}{4} \{0 + E[X_n X_{n+m-1}] + 0 + 0\} = \frac{1}{4} E[X_n X_{n+m-1}]$, which is 0 if m ≠ 1 and $\frac{1}{4} \sigma^2$ if m = 1.

Similarly, when m < 0, we find that

$E[Y_n Y_{n+m}] = 0$ if m ≠ -1, and $\frac{1}{4} \sigma^2$ if m = -1.

Hence, we can write that

$COV[Y_n, Y_{n+m}] = \frac{1}{2} \sigma^2$ if m = 0; $\frac{1}{4} \sigma^2$ if m = ±1; 0 if m ≠ 0, ±1,

so that $COV[Y_n, Y_{n+m}] = \gamma(m) = \gamma(-m)$ (and $E[Y_n] = 0$). Thus, the process $\{Y_n, n = 1, 2, \ldots\}$
is weakly stationary.
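
The autocovariance function found above is that of a two-term moving average of uncorrelated variables, and can be computed from the averaging coefficients alone. The sketch assumes, as in the solution, that the $X_n$ have zero mean, common variance $\sigma^2$, and are uncorrelated; the function name is hypothetical and $\sigma^2$ is set to 1 for illustration.

```python
def ma_autocovariance(coeffs, sigma2, lag):
    """COV[Y_n, Y_{n+lag}] for Y_n = sum_i coeffs[i] * X_{n-i}, X uncorrelated."""
    m = abs(lag)
    if m >= len(coeffs):
        return 0.0
    return sigma2 * sum(coeffs[i] * coeffs[i + m] for i in range(len(coeffs) - m))


coeffs = [0.5, 0.5]  # Y_n = (X_n + X_{n-1}) / 2
gammas = {m: ma_autocovariance(coeffs, 1.0, m) for m in range(-2, 3)}
print(gammas)
```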

Question no. 2 

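Let

$X_i = \frac{1}{a_i} (Y_i - b_i) := h_i(Y_1, Y_2)$ for i = 1, 2.

We have that

$J := \begin{vmatrix} \frac{\partial h_1(y_1,y_2)}{\partial y_1} & \frac{\partial h_1(y_1,y_2)}{\partial y_2} \\ \frac{\partial h_2(y_1,y_2)}{\partial y_1} & \frac{\partial h_2(y_1,y_2)}{\partial y_2} \end{vmatrix} = \frac{1}{a_1 a_2} \ne 0$ ∀$(y_1, y_2)$.

Because the partial derivatives $\partial h_i(y_1, y_2) / \partial y_j$, for i, j = 1, 2, are all continuous
∀$(y_1, y_2)$, we can write that

$f_{Y_1,Y_2}(y_1, y_2) = f_{X_1,X_2}\left( \frac{1}{a_1}(y_1 - b_1), \frac{1}{a_2}(y_2 - b_2) \right) \left| \frac{1}{a_1 a_2} \right|$.

By independence,

$f_{X_1,X_2}(x_1, x_2) = \frac{1}{2\pi} e^{-(x_1^2 + x_2^2)/2} = \frac{1}{2\pi} \exp\left\{ -\frac{1}{2} (x_1^2 + x_2^2) \right\}$

∀$(x_1, x_2) \in \mathbb{R}^2$. It follows that

$f_{Y_1,Y_2}(y_1, y_2) = \frac{1}{|a_1 a_2|} \frac{1}{2\pi} \exp\left\{ -\frac{1}{2} \left[ \frac{(y_1 - b_1)^2}{a_1^2} + \frac{(y_2 - b_2)^2}{a_2^2} \right] \right\}$

∀$(y_1, y_2) \in \mathbb{R}^2$.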

Remarks. (i) From Definition 7.1.6, we can assert that the random vector $(Y_1, Y_2)$ has
a bivariate normal distribution. Moreover, we have that

$E[Y_i] = b_i$, $VAR[Y_i] = a_i^2$

and

$COV[Y_1, Y_2] = a_1 a_2\, COV[X_1, X_2] + 0 = 0$ (because $COV[X_1, X_2] = 0$).

Hence, we can write that $\mathbf{m} = (b_1, b_2)$ and

$\mathbf{C} = \begin{bmatrix} a_1^2 & 0 \\ 0 & a_2^2 \end{bmatrix}$.

Finally, Equation (7.2) yields that

$f_{Y_1,Y_2}(y_1, y_2) = \{(2\pi)^2 a_1^2 a_2^2\}^{-1/2} \exp\left\{ -\frac{1}{2} (y_1 - b_1, y_2 - b_2)\, \mathbf{C}^{-1} \begin{bmatrix} y_1 - b_1 \\ y_2 - b_2 \end{bmatrix} \right\}$,

where

$\mathbf{C}^{-1} = \frac{1}{a_1^2 a_2^2} \begin{bmatrix} a_2^2 & 0 \\ 0 & a_1^2 \end{bmatrix}$.

It is a simple matter to check that the function $f_{Y_1,Y_2}(y_1, y_2)$ above is the same as the
one obtained from Proposition 4.3.1.

(ii) We can also use the fact that $Y_1 \sim N(b_1, a_1^2)$ and $Y_2 \sim N(b_2, a_2^2)$ are independent
random variables to get $f_{Y_1,Y_2}(y_1, y_2)$. Indeed, their covariance being equal to 0 (see the
preceding remark), we can assert that they are independent (because they are Gaussian
random variables).

(iii) If $a_i = 0$, then $Y_i = b_i$ is a degenerate Gaussian random variable. Its probability
density function is given by

$f_{Y_i}(y_i) = \delta(y_i - b_i)$ ∀$y_i \in \mathbb{R}$,

where δ(·) is the Dirac delta function defined by (see p. 4)

$\delta(x) = 0$ if x ≠ 0, $\delta(x) = \infty$ if x = 0, and $\int_{-\infty}^{\infty} \delta(x) \, dx = 1$.


Question no. 3 

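Equation (7.18) yields

$VAR[X_n] = \gamma(0) = \alpha_1 \gamma(1) + \alpha_2 \gamma(2) + \alpha_3 \gamma(3) + \sigma^2$   (*)

and we deduce from (7.13) that

$\gamma(1) = \alpha_1 \gamma(0) + \alpha_2 \gamma(1) + \alpha_3 \gamma(2)$,   (1)
$\gamma(2) = \alpha_1 \gamma(1) + \alpha_2 \gamma(0) + \alpha_3 \gamma(1)$,   (2)
$\gamma(3) = \alpha_1 \gamma(2) + \alpha_2 \gamma(1) + \alpha_3 \gamma(0)$.   (3)

Substituting (3) into (*), we obtain that

$\gamma(0) = \alpha_1 \gamma(1) + \alpha_2 \gamma(2) + \alpha_3 [\alpha_1 \gamma(2) + \alpha_2 \gamma(1) + \alpha_3 \gamma(0)] + \sigma^2$

$\implies \gamma(0) = \frac{1}{1 - \alpha_3^2} \left[ (\alpha_1 + \alpha_2 \alpha_3) \gamma(1) + (\alpha_2 + \alpha_1 \alpha_3) \gamma(2) + \sigma^2 \right]$.

Next, (1) and (2) imply that

$\gamma(1) = \alpha_1 \gamma(0) + \alpha_2 \gamma(1) + \alpha_3 [(\alpha_1 + \alpha_3) \gamma(1) + \alpha_2 \gamma(0)]$

$\implies \gamma(1) = \gamma(0) \left\{ \frac{\alpha_1 + \alpha_2 \alpha_3}{1 - \alpha_2 - (\alpha_1 + \alpha_3) \alpha_3} \right\}$.

Then, making use again of (2), we find that

$\gamma(2) = \gamma(0) \left\{ \frac{\alpha_2 (1 - \alpha_2) + \alpha_1 (\alpha_1 + \alpha_3)}{1 - \alpha_2 - (\alpha_1 + \alpha_3) \alpha_3} \right\}$.

Hence, we can write that

$(1 - \alpha_3^2) \gamma(0) = \sigma^2 + \frac{(\alpha_1 + \alpha_2 \alpha_3)^2}{1 - \alpha_2 - (\alpha_1 + \alpha_3) \alpha_3} \gamma(0) + \frac{(\alpha_2 + \alpha_1 \alpha_3)}{1 - \alpha_2 - (\alpha_1 + \alpha_3) \alpha_3} [\alpha_2 (1 - \alpha_2) + \alpha_1 (\alpha_1 + \alpha_3)] \gamma(0)$.

Thus, we have:

$\gamma(0) = \sigma^2 \left\{ 1 - \alpha_3^2 - \frac{(\alpha_1 + \alpha_2 \alpha_3)^2 + (\alpha_2 + \alpha_1 \alpha_3) [\alpha_2 (1 - \alpha_2) + \alpha_1 (\alpha_1 + \alpha_3)]}{1 - \alpha_2 - (\alpha_1 + \alpha_3) \alpha_3} \right\}^{-1}$.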











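Remark. In the case when $\alpha_1 = \alpha_2 = \alpha_3 := \alpha$, we find that

$\gamma(0) = \sigma^2 \left\{ \frac{1 - \alpha - 2\alpha^2}{1 - \alpha - 5\alpha^2 - 3\alpha^3} \right\} = \sigma^2 \frac{1 - 2\alpha}{(1 + \alpha)(1 - 3\alpha)}$.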
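
The expression for $\gamma(0)$ can be checked against an independent computation. The sketch below compares the closed form obtained from equations (*), (1), (2) and (3) with the variance computed from the moving-average representation of the AR(3) process, namely $\sigma^2$ times the sum of squared weights. The coefficient values are illustrative assumptions chosen so that the process is stationary, and both function names are hypothetical.

```python
def gamma0_closed_form(a1, a2, a3, sigma2=1.0):
    """gamma(0) obtained by solving (*), (1), (2) and (3) for an AR(3) process."""
    d = 1 - a2 - (a1 + a3) * a3
    num = (a1 + a2 * a3) ** 2 + (a2 + a1 * a3) * (a2 * (1 - a2) + a1 * (a1 + a3))
    return sigma2 / (1 - a3 ** 2 - num / d)


def gamma0_from_weights(a1, a2, a3, sigma2=1.0, terms=500):
    """gamma(0) as sigma^2 times the sum of squared MA(infinity) weights."""
    psi = [1.0]
    for j in range(1, terms):
        value = a1 * psi[j - 1]
        if j >= 2:
            value += a2 * psi[j - 2]
        if j >= 3:
            value += a3 * psi[j - 3]
        psi.append(value)
    return sigma2 * sum(w * w for w in psi)


for alphas in [(0.1, 0.1, 0.1), (0.3, 0.2, 0.1)]:
    print(alphas, gamma0_closed_form(*alphas), gamma0_from_weights(*alphas))
```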


Question no. 4 

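We deduce from Equation (7.5) that $X_n = \sum_{i=1}^{n} \alpha_1^{n-i} \epsilon_i$, which implies that

$E[Y_n] = E\left[ \exp\left\{ \sum_{i=1}^{n} \alpha_1^{n-i} \epsilon_i \right\} \right] = E\left[ \prod_{i=1}^{n} \exp\{\alpha_1^{n-i} \epsilon_i\} \right] \overset{\text{ind.}}{=} \prod_{i=1}^{n} E\left[ \exp\{\alpha_1^{n-i} \epsilon_i\} \right]$

$= \prod_{i=1}^{n} \exp\left\{ \frac{\sigma^2}{2} \alpha_1^{2(n-i)} \right\} = \exp\left\{ \frac{\sigma^2}{2} \alpha_1^{2n} \sum_{i=1}^{n} \alpha_1^{-2i} \right\}$.

Now, define

$S_n = \sum_{i=1}^{n} \alpha_1^{-2i} = \alpha_1^{-2} + \alpha_1^{-4} + \cdots + \alpha_1^{-2n}$.

We have [see Equation (7.7)] that

$S_n = \frac{1 - \alpha_1^{-2n}}{\alpha_1^2 - 1}$.

Therefore, we can write that

$E[Y_n] = \exp\left\{ \frac{\sigma^2}{2} \cdot \frac{\alpha_1^{2n} - 1}{\alpha_1^2 - 1} \right\}$.

Remark. If $\{X_n, n \in \mathbb{Z}\}$ is a zero mean weakly stationary AR(1) process and $Y_n = e^{X_n}$
for all $n \in \mathbb{Z}$, then [see Equation (7.9)]

$E[Y_n] = E\left[ \exp\left\{ \sum_{i=0}^{\infty} \alpha_1^i \epsilon_{n-i} \right\} \right] = E\left[ \prod_{i=0}^{\infty} \exp\{\alpha_1^i \epsilon_{n-i}\} \right] \overset{\text{ind.}}{=} \prod_{i=0}^{\infty} \exp\left\{ \frac{\sigma^2}{2} \alpha_1^{2i} \right\} = \exp\left\{ \frac{\sigma^2}{2} \cdot \frac{1}{1 - \alpha_1^2} \right\}$,

where we assumed that we can interchange the mathematical expectation and the
infinite product.

Question no. 5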

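Equation (7.23) (with $\theta_0 = 1$) yields that

$\rho(1) = \frac{\sum_{i=0}^{2} \theta_i \theta_{i+1}}{1 + \theta_1^2 + \theta_2^2 + \theta_3^2} = \frac{\theta_1 + \theta_1 \theta_2 + \theta_2 \theta_3}{1 + \theta_1^2 + \theta_2^2 + \theta_3^2}$,

$\rho(2) = \frac{\sum_{i=0}^{1} \theta_i \theta_{i+2}}{1 + \theta_1^2 + \theta_2^2 + \theta_3^2} = \frac{\theta_2 + \theta_1 \theta_3}{1 + \theta_1^2 + \theta_2^2 + \theta_3^2}$,

$\rho(3) = \frac{\sum_{i=0}^{0} \theta_i \theta_{i+3}}{1 + \theta_1^2 + \theta_2^2 + \theta_3^2} = \frac{\theta_3}{1 + \theta_1^2 + \theta_2^2 + \theta_3^2}$.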


Question no. 6 

The function ρ(2) is given by [see (7.25)]

$\rho(2) = \frac{\theta_2}{1 + \theta_1^2 + \theta_2^2}$.

(a) When $\theta_1 = 1$, we set

$g(x) = \frac{x}{2 + x^2}$

and we calculate

$g'(x) = \frac{2 - x^2}{(2 + x^2)^2} = 0 \iff x = \pm\sqrt{2}$.

It follows that

$-\frac{\sqrt{2}}{4} \le \rho(2) \le \frac{\sqrt{2}}{4}$.

(b) In the general case, let

$g(x, y) = \frac{y}{1 + x^2 + y^2}$.

We have that

$\frac{\partial g}{\partial x} = \frac{-2xy}{(1 + x^2 + y^2)^2} = 0 \iff x = 0$ or $y = 0$

and

$\frac{\partial g}{\partial y} = \frac{1 + x^2 - y^2}{(1 + x^2 + y^2)^2} = 0 \iff y^2 = x^2 + 1$.

When y = 0, ρ(2) is equal to 0, which is neither a maximum nor a minimum. Hence,
the extrema of the function g(x, y) are attained at (x, y) = (0, -1) and (0, 1). Because

$g(0, -1) = -\frac{1}{2}$ and $g(0, 1) = \frac{1}{2}$,

we conclude that

$-\frac{1}{2} \le \rho(2) \le \frac{1}{2}$,

as the function ρ(1) in the case of an MA(1) process.

Question no. 7 

We can write that 


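$X_n = \alpha_1 X_{n-1} + \epsilon_n + \theta_1 \epsilon_{n-1} + \theta_2 \epsilon_{n-2}$ ∀$n \in \mathbb{Z}$.

First, we calculate

$\gamma(0) = E[X_n^2] = E[(\alpha_1 X_{n-1} + \epsilon_n + \theta_1 \epsilon_{n-1} + \theta_2 \epsilon_{n-2}) X_n]$

$= \alpha_1 \gamma(1) + E[\epsilon_n^2] + \theta_1 E[\epsilon_{n-1} X_n] + \theta_2 E[\epsilon_{n-2} X_n]$.

Next, we have that

$E[\epsilon_{n-1} X_n] = \alpha_1 E[\epsilon_{n-1} X_{n-1}] + E[\epsilon_{n-1} \epsilon_n] + \theta_1 E[\epsilon_{n-1}^2] + \theta_2 E[\epsilon_{n-1} \epsilon_{n-2}]$   (the last three expectations being 0, $\sigma^2$ and 0)

$= \alpha_1 E[\epsilon_{n-1} (\alpha_1 X_{n-2} + \epsilon_{n-1} + \theta_1 \epsilon_{n-2} + \theta_2 \epsilon_{n-3})] + \theta_1 \sigma^2$

$= \alpha_1 E[\epsilon_{n-1}^2] + \theta_1 \sigma^2 = \sigma^2 (\alpha_1 + \theta_1)$.

Similarly, we calculate

$E[\epsilon_{n-2} X_n] = \alpha_1 E[\epsilon_{n-2} X_{n-1}] + E[\epsilon_{n-2} \epsilon_n] + \theta_1 E[\epsilon_{n-2} \epsilon_{n-1}] + \theta_2 E[\epsilon_{n-2}^2]$   (the last three expectations being 0, 0 and $\sigma^2$)

$= \alpha_1 E[\epsilon_{n-2} (\alpha_1 X_{n-2} + \epsilon_{n-1} + \theta_1 \epsilon_{n-2} + \theta_2 \epsilon_{n-3})] + \theta_2 \sigma^2$
$= \alpha_1 \{E[\epsilon_{n-2} \alpha_1 X_{n-2}] + \theta_1 \sigma^2\} + \theta_2 \sigma^2$
$= \alpha_1 \{\alpha_1 E[\epsilon_{n-2}^2] + \theta_1 \sigma^2\} + \theta_2 \sigma^2 = \sigma^2 (\alpha_1^2 + \alpha_1 \theta_1 + \theta_2)$.

Hence, we obtain that

$\gamma(0) = \alpha_1 \gamma(1) + \sigma^2 + \theta_1 (\alpha_1 + \theta_1) \sigma^2 + \theta_2 (\alpha_1^2 + \alpha_1 \theta_1 + \theta_2) \sigma^2$.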


Remark. We see that to obtain an explicit expression for 7(0), we must also calculate 
(at least) 7(1). 

Question no. 8 

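For k = 1, we have [see (7.33)] that

$\phi(1) = \rho(1) \overset{(7.24)}{=} \frac{\theta_1}{1 + \theta_1^2}$,

which agrees with the formula (7.35) in the case when k = 1.

When k = 2, we can write [see (7.34)] that

$\phi(2) = \frac{\rho(2) - \rho^2(1)}{1 - \rho^2(1)} \overset{(7.24)}{=} \frac{0 - (\theta_1 / (1 + \theta_1^2))^2}{1 - (\theta_1 / (1 + \theta_1^2))^2}$,

and the formula (7.35) with k = 2 becomes

$\phi(2) = -\frac{(-\theta_1)^2}{1 + \theta_1^2 + \theta_1^4} = -\frac{\theta_1^2}{1 + \theta_1^2 + \theta_1^4}$.

We indeed have that

$\frac{-(\theta_1 / (1 + \theta_1^2))^2}{1 - (\theta_1 / (1 + \theta_1^2))^2} = \frac{-\theta_1^2}{(1 + \theta_1^2)^2 - \theta_1^2} = -\frac{\theta_1^2}{1 + \theta_1^2 + \theta_1^4}$.

Finally, in the case when k = 3, we first calculate

$D_{3,1} = \begin{vmatrix} 1 & \rho(1) & \rho(1) \\ \rho(1) & 1 & \rho(2) \\ \rho(2) & \rho(1) & \rho(3) \end{vmatrix}$

$= \rho(3) - \rho(1)\rho(2) - \rho(1)[\rho(1)\rho(3) - \rho^2(2)] + \rho(1)[\rho^2(1) - \rho(2)]$
$= \rho^3(1) - \rho^2(1)\rho(3) - 2\rho(1)\rho(2) + \rho(1)\rho^2(2) + \rho(3)$

and

$D_{3,2} = \begin{vmatrix} 1 & \rho(1) & \rho(2) \\ \rho(1) & 1 & \rho(1) \\ \rho(2) & \rho(1) & 1 \end{vmatrix}$

$= 1 - \rho^2(1) - \rho(1)[\rho(1) - \rho(1)\rho(2)] + \rho(2)[\rho^2(1) - \rho(2)]$
$= 1 - 2\rho^2(1) + 2\rho^2(1)\rho(2) - \rho^2(2)$.

Because, from (7.24), ρ(k) = 0 for |k| = 2, 3, ..., we obtain that

$D_{3,1} = \rho^3(1)$ and $D_{3,2} = 1 - 2\rho^2(1)$.

It follows that

$\phi(3) = \frac{\rho^3(1)}{1 - 2\rho^2(1)} \overset{(7.24)}{=} \frac{(\theta_1 / (1 + \theta_1^2))^3}{1 - 2(\theta_1 / (1 + \theta_1^2))^2}$

$= \frac{\theta_1^3}{(1 + \theta_1^2)^3 - 2\theta_1^2 (1 + \theta_1^2)} = \frac{\theta_1^3}{1 + \theta_1^2 + \theta_1^4 + \theta_1^6}$,

which is the same formula as (7.35) with k = 3.

Question no. 9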

We find that $\bar{x} = -0.034$. Next, we calculate

k                  0         1         2         3
$\hat{\gamma}(k)$     0.3364    0.0810   -0.0347    0.0202

Hence, $\hat{\sigma}_X^2 = \hat{\gamma}(0) \simeq 0.3364$. It follows that

k                  1         2         3
$\hat{\rho}(k)$       0.2408   -0.1032    0.0600

Then, we calculate (see the previous question)

$\hat{\phi}(1) = \hat{\rho}(1) \simeq 0.2408$,

$\hat{\phi}(2) \simeq \frac{-0.1032 - (0.2408)^2}{1 - (0.2408)^2} \simeq -0.171$

and

$\hat{\phi}(3) \simeq \frac{(.2408)^3 - (.2408)^2 (.06) - 2(.2408)(-.1032) + (.2408)(-.1032)^2 + .06}{1 - 2(.2408)^2 + 2(.2408)^2 (-.1032) - (-.1032)^2} \simeq 0.143$.

Now, for an MA(1) time series with $\theta_1 = 1/2$ and $\epsilon_n \sim U(-1, 1)$, we have [see (7.21)]
that

$\sigma_X^2 = VAR[X_n] = (1 + \theta_1^2) \sigma^2 = \left( 1 + \frac{1}{4} \right) \frac{1}{3} = \frac{5}{12}$

and [see (7.35) and the previous question]

$\phi(1) = -\frac{(-1/2)}{1 + (1/2)^2} = \frac{2}{5}$,

$\phi(2) = -\frac{(-1/2)^2}{1 + (1/2)^2 + (1/2)^4} = -\frac{4}{21} \simeq -0.190$

and

$\phi(3) = -\frac{(-1/2)^3}{1 + (1/2)^2 + (1/2)^4 + (1/2)^6} = \frac{8}{85} \simeq 0.094$.

Actually, the data are indeed observations of an MA(1) time series with $\theta_1 = 1/2$ and
the $\epsilon_n$'s as above. However, we see that, as in Example 7.3.1, the number of observations
is too small to obtain good point estimates of $\sigma_X^2$ and the φ(k)'s.
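
The partial autocorrelations above can also be obtained recursively from the autocorrelations, without expanding determinants. The sketch below uses the Durbin-Levinson recursion, which is a standard alternative to the determinant formulas of the previous question rather than the method used in this solution; the function name is hypothetical. It is applied both to the sample autocorrelations and to the theoretical ones of an MA(1) process with $\theta_1 = 1/2$, for which ρ(1) = 0.4 and ρ(k) = 0 for k ≥ 2.

```python
def partial_autocorrelations(rho):
    """phi(1), ..., phi(K) from rho = [rho(1), ..., rho(K)] (Durbin-Levinson)."""
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    pacf = []
    for k in range(1, len(rho) + 1):
        num = rho[k - 1] - sum(phi_prev[j] * rho[k - 2 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_prev.append(phi_kk)
        pacf.append(phi_kk)
    return pacf


sample = partial_autocorrelations([0.2408, -0.1032, 0.0600])
theory = partial_autocorrelations([0.4, 0.0, 0.0])
print([round(v, 3) for v in sample], [round(v, 3) for v in theory])
```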


Question no. 10 

(a) Because, by assumption, the stochastic process $\{X_n, n \in \mathbb{Z}\}$ is stationary, we can
write that

$E[X_n] \equiv \mu$ and $VAR[X_n] \equiv \sigma_X^2$.

Therefore, Equation (7.3) implies that

$E[X_{n+j} \mid X_n = x_n] = \mu + \rho_{X_n, X_{n+j}} \frac{\sigma_X}{\sigma_X} (x_n - \mu) = \mu + \rho(j)(x_n - \mu)$

$= \rho(j) x_n + \mu [1 - \rho(j)]$.

(b) We have that

$E[X_{n+j}^2 \mid X_n = x_n] = VAR[X_{n+j} \mid X_n = x_n] + (E[X_{n+j} \mid X_n = x_n])^2$
$= \sigma_X^2 [1 - \rho^2(j)] + \{\rho(j) x_n + \mu [1 - \rho(j)]\}^2$.

Remark. We see that if μ = 0 and $\sigma_X^2 = 1$, then $E[X_{n+j} \mid X_n = 0] = 0$ ∀j and
$VAR[X_{n+j} \mid X_n = 0] = E[X_{n+j}^2 \mid X_n = 0] = 1 - \rho^2(j)$ for j = 1, 2, ...


D 


Answers to even-numbered exercises 


Chapter 1 

2. F(x) is continuous at any $x \in \mathbb{R}$, except at $x \in \{0, 1, 2\}$. At these points, it is only
right-continuous.

4. Discontinuous. 

6. xe~ x . 

8. f'(x) = |x -1 / 3 ; x = 0. 

10. $\sqrt{2\pi}$.

12. $-\frac{1}{2} e^{-x} (\cos x + \sin x)$.

14. 3/8. 

16. - 
18. In 2. 

20. (a) p)z \ (b) k\(l-p) k p. 

Chapter 2 

2. (a) 7/12; (b) 55/72; (c) 2/9; (d) 3/8. 

4. (a) $P[\{\omega_1\}] = 0.1$; $P[\{\omega_2\}] = 0.045$; $P[\{\omega_3\}] = 0.0171$; $P[\{\omega_4\}] = 0.8379$;

(b) (i) $R = A_1 \cup (A_1^c \cap A_2) \cup (A_1^c \cap A_2^c \cap A_3)$; (ii) 0.1621; (iii) 0.2776;

(c) (i) $B = R_1^c \cup R_2^c \cup R_3^c$; (ii) 0.9957.

6. 4/5. 

8. 0.9963. 

10. (a) 1/10; (b) 1/10. 

12. (a) 0.325; (b) 0.05; (c) 0.25; (d) 0.66375. 

14. 48/95. 

16. 23/32. 

18. 9. 

20. 63.64%. 

22. (a) 0.8829; (b) 0.9083.

24. (a) 0.9972; (b) (i) 0.8145; (ii) 0.13.

26. (a) -; (b) (c) 0.8585. 

28. (a) 1/64; (b) 3/32. 

30. (a) 86,400; (b) 3840. 

Chapter 3 

2. (a) Hyp(N = 100, n = 2, d = number of defective devices in the batch); (b) 0.9602;
(c) 0.9604; (d) 0.9608.

4. (a) Poi(3); 0.616; (b) Poi(9); 0.979; (c) Exp(3); $1 - e^{-3}$; (d) Exp(3); $e^{-3}$; (e) G(α = 2, λ = 3);
(f) B(n = 100, p = 0.05); 0.037.

6. (a) (i) B(20, θ); (ii) Hyp(1000, 20, 1000θ); (b) (i) 0.010; (ii) 0.032.

8. (a) X ~ Poi(1.2); Y ~ Geo(0.301); Z ~ B(12, 0.301); (b) 4th; (c) 0.166; (d) 0.218.

10. 0.7746. 

12. 0.6233. 

14. 0.1215. 

16. 0.759. 

18. 0.4. 

20. $1 - e^{-5}$.

22. (a) 

$f_X(x) = 1/2$ if 0 < x < 1; 1/6 if 1 ≤ x < 4; 0 elsewhere;

(b) 2.5; (c) 1.5; (d) ∞; (e) (i) 1/2; (ii) 1.

24. (a) 0.657; (b) 0.663; (c) 7/17. 

26. (a) 0.3754; (b) 0.3679. 

28. (a) 3; (b)

$F_X(x) = 0$ if x < 0; $3x^2 - 2x^3$ if 0 ≤ x ≤ 1; 1 if x > 1;

( c ) (Ii) 2 *’ for = 1,2,...; (d) 

$f_Z(z) = 3(1 - \sqrt{z})$ if 0 < z < 1; 0 elsewhere.


30. (a) 6; (b) 7; (c) 0.8. 

32. 5/8. 

34. 0.00369. 

36. 0.027. 

38. $1 - e^{-\lambda y}$ if y = 1, 2, ... .

40. 2.963. 

42. Two engines: reliability ≈ 0.84; four engines: reliability ≈ 0.821. So, here greater
reliability with two engines.

44. (a) 0.75; (b) 0.2519.

46. (a) (i) 0.7833; (ii) 0.7558; (b) 0.7361.

48. (a) $E[X] = (\pi k / 2)^{1/2}$; $VAR[X] = 2k (1 - \frac{\pi}{4})$; (b) k is a scale parameter.

50. (a) $\sqrt{\pi}/8$; (b) $E[e^{X^2/2}] = 1.25$; $VAR[e^{X^2/2}] = \infty$.

52. (a) 



s — a — j3W if W > wo, 
r — a — (3W if W < wq] 


(b) Wo + 



Chapter 4 


2. (a) 3; (b)

$f_X(x) = 3x^2$ if 0 < x < 1; 0 elsewhere;

$f_Y(y) = \frac{3}{2} (1 - y^2)$ if 0 < y < 1; 0 elsewhere;

(c) VAR[X] = 3/80; VAR[Y] = 19/320; (d) 0.3974.

4. (a) N(750θ, 187.5θ²); (b) $220.

6. (a) (i)

$f_X(x) = \frac{3}{4} (1 - x^2)$ if -1 < x < 1; 0 elsewhere;

$f_Y(y) = \frac{3}{2} \sqrt{y}$ if 0 < y < 1; 0 elsewhere;

(ii) 0; (b) no, because $f_X(x) f_Y(y)$ is not identical to 3/4 [= $f_{X,Y}(x,y)$].

8. 1.5. 

10. 1/64. 

12. 0.1586. 

14. -1/11. 

16. 0.8665. 

18. (a)

$f_X(x) = 6x(1 - x)$ if 0 < x < 1; 0 elsewhere;

$f_Y(y) = 3y^2$ if 0 < y < 1; 0 elsewhere;

(b) 5/16.

20. (a) 0.5987; (b) 1.608; (c) 0.0871. 

22. (a) 0.6065; (b) 7.29; (c) 0.1428; (d) 0.3307; (e) 0.7291. 

24. (a) 0.2048; (b) 3/5; (c) $e^{-3/5}$; (d) 0.0907. 

26. (a) 0.6587; (b) 3.564. 

28. (a) 0.157; (b) 0.3413; X = where $X_i \sim$ Exp(1/2), for all $i$, and the $X_i$'s 
are independent random variables. 

30. 10/3. 







32. 18. 

34. (a) (b) 

$$f_X(x) = \begin{cases} \frac{2}{\pi}\sqrt{1 - x^2} & \text{if } -1 < x < 1, \\ 0 & \text{elsewhere} \end{cases}$$

and $f_Y(y) = f_X(y)$, by symmetry; (c) no, because $f_X(x) f_Y(y)$ is not identical to $1/\pi$ 
[$= f_{X,Y}(x, y)$]; (d) 3/4. 

36. (a) 0.045; (b) 527.08. 

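38. (a) $T \sim$ Exp(1/5); (b) 

$$f_T(t) = \begin{cases} \frac{1}{5}(1 - e^{-t/50})^9 e^{-t/50} & \text{if } t > 0, \\ 0 & \text{elsewhere;} \end{cases}$$

(c) $T \sim$ G($\alpha = 10$, $\lambda = 1/50$); that is, 

$$f_T(t) = \begin{cases} \dfrac{1}{50^{10}\,\Gamma(10)}\, t^9 e^{-t/50} & \text{if } t > 0, \\ 0 & \text{elsewhere.} \end{cases}$$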


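40. (a) 

$$f_Y(y) = \begin{cases} 3(y - 1)^2 & \text{if } 0 < y < 1, \\ 0 & \text{elsewhere;} \end{cases}$$

(b) 0.1566; (c) 

$$f_Z(z) = \begin{cases} (1 - z^{-1/3})^2 & \text{if } 0 < z < 1, \\ 0 & \text{elsewhere;} \end{cases}$$

(d) 1/2. 

42. (a) (i) 

$x_2$               0      1 
$p_{X_2}(x_2)$      7/12   5/12 

(ii) 1/3 if $x_2 = 0$, and 2/3 if $x_2 = 1$; (b) 

$y$         -1    0     3     8 
$F_Y(y)$    1/6   1/2   5/6   1 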


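44. (a) 0.95; (b) (i) 

$$P[Y \le y \mid X_1 = x_1] = \begin{cases} 0 & \text{if } y < x_1, \\ 1 - e^{-(y - x_1)/2} & \text{if } y \ge x_1; \end{cases}$$

(ii) 

$$f_Y(y \mid X_1 = x_1) = \begin{cases} \frac{1}{2} e^{-(y - x_1)/2} & \text{if } y > x_1, \\ 0 & \text{elsewhere.} \end{cases}$$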


46. (a) $X \sim$ Poi(1); $Y \sim$ Poi(1/2); $Y \mid \{X = 35\} \sim$ B($n = 35$, $p = 1/2$); (b) (i) 0.5873; 
(ii) 0.5876; (c) 0.6970. 

48. (a) 

$$f_X(x) = \begin{cases} \frac{1}{8}(4 - x) & \text{if } 0 < x < 4, \\ 0 & \text{elsewhere;} \end{cases}$$

(b) (i) 4/3; (ii) 8/9; (iii) 0.32; (iv) 2.4; (c) $-1/2$; $X$ and $Y$ are not independent, because 
$\rho_{X,Y} \ne 0$. 

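50. (a) 

$$f_X(x) = \begin{cases} 15x^2 - 45x^4 + 30x^5 & \text{if } 0 < x < 1, \\ 0 & \text{elsewhere;} \end{cases} \qquad f_Y(y) = \begin{cases} 30y^4(1 - y) & \text{if } 0 < y < 1, \\ 0 & \text{elsewhere;} \end{cases}$$

(b) no, because, for instance, $f_{X,Y}(1, 1/2) = 22.5 \ne f_X(1) f_Y(1/2) = 0$; (c) 0.0191. 
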
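
The argument in part (b) of answer 50 rests on the product of the marginals vanishing at the point (1, 1/2). A minimal sketch of that check, using exact rational arithmetic and the two marginal densities as printed in part (a) (the function names are illustrative):

```python
from fractions import Fraction

def f_x(x):
    """Marginal density of X on (0, 1), as given in part (a)."""
    return 15 * x**2 - 45 * x**4 + 30 * x**5

def f_y(y):
    """Marginal density of Y on (0, 1), as given in part (a)."""
    return 30 * y**4 * (1 - y)

def integral_01(coeffs):
    """Exact integral over (0, 1) of a polynomial given as {power: coefficient}."""
    return sum(Fraction(c, p + 1) for p, c in coeffs.items())

total_x = integral_01({2: 15, 4: -45, 5: 30})   # mass of f_X
total_y = integral_01({4: 30, 5: -30})          # mass of f_Y
product = f_x(Fraction(1)) * f_y(Fraction(1, 2))

print(total_x, total_y, product)
```

Both marginals integrate to 1, while the product at (1, 1/2) is 0 because $f_X(1) = 15 - 45 + 30 = 0$. A joint density equal to 22.5 at that point therefore cannot factor into the marginals.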
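52. (a) 

$$f_{X,Y}(x, y) = \begin{cases} \dfrac{1}{2(x + 1)} & \text{if } -1 < x < 1,\ -1 < y < x, \\ 0 & \text{elsewhere;} \end{cases}$$

$$f_Y(y) = \begin{cases} \frac{1}{2}[\ln 2 - \ln(y + 1)] & \text{if } -1 < y < 1, \\ 0 & \text{elsewhere;} \end{cases}$$

(b) (i) $-1/3$; (ii) $-1/2$; (c) 1/6. 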
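
The marginal density of Y in answer 52 (a) follows from integrating the joint density over x. The short derivation below assumes the joint density is $1/[2(x+1)]$ on the region $-1 < y < x < 1$, as the answer suggests:

```latex
f_Y(y) = \int_{y}^{1} \frac{1}{2(x+1)} \, dx
       = \frac{1}{2} \Big[ \ln(x+1) \Big]_{x=y}^{x=1}
       = \frac{1}{2} \left[ \ln 2 - \ln(y+1) \right],
       \qquad -1 < y < 1 .
```

Integrating this expression over $(-1, 1)$ gives 1, as it must for a density.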
54. (a) 

$x_2 \backslash x_1$    -1      0       1 
-1                      1/64    6/64    1/64 
 0                      6/64    36/64   6/64 
 1                      1/64    6/64    1/64 

(b) $-\sqrt{2}/2$; (c) no, because $\rho_{X,Y} \ne 0$. 

Chapter 5 

2. $r(x) = [x(1 - \ln x)]^{-1}$, for $1 < x < e$. We find that $r'(x) > 0$ in the interval $(1, e)$. 
Hence, the distribution is IFR. 
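
The monotonicity claim in answer 2 can be checked numerically. The sketch below evaluates the failure rate on an interior grid of (1, e) and also uses the closed form $r'(x) = \ln x / [x(1 - \ln x)]^2$, which follows from the quotient rule; the grid size and function names are arbitrary choices made for illustration.

```python
import math

def failure_rate(x):
    """r(x) = 1 / [x (1 - ln x)], defined for 1 < x < e."""
    return 1.0 / (x * (1.0 - math.log(x)))

def failure_rate_derivative(x):
    """Closed form r'(x) = ln x / [x (1 - ln x)]^2 (quotient rule)."""
    return math.log(x) / (x * (1.0 - math.log(x))) ** 2

# Interior grid of (1, e); the endpoint e is excluded because r(x) blows up there.
grid = [1.0 + i * (math.e - 1.0) / 200 for i in range(1, 200)]
rates = [failure_rate(x) for x in grid]

is_increasing = all(a < b for a, b in zip(rates, rates[1:]))
derivative_positive = all(failure_rate_derivative(x) > 0 for x in grid)
print(is_increasing, derivative_positive)
```

Since $\ln x > 0$ for $x > 1$, the derivative is positive throughout the open interval, which is what the IFR conclusion requires.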

4. $R(t) = (e^{-t} - e^{-2t})/(2t)$, for $t > 0$. 

6 7i P -2 

U. igC . 

r. The distribution is IFR. 


8. $r_X(k) = \dfrac{1}{N - k + 1}$. 

10. (a) $2\lambda/3$; (b) $\lambda/2$. 

12. 1/3. 

14. $R(t) = 1 - (1 - e^{-\lambda_C t}) \{1 - [1 - (1 - e^{-\lambda_A t})^2] e^{-\lambda_B t}\}$. 

16. $\approx 0.3710$. 

18. $0.996303 < R(t_0) < 0.997107$ (approximately). The exact answer is 0.996327. 

20. (a) $p^3(3 - 2p)$; (b) $e^{-3\theta t}(3 - 2e^{-\theta t})$, for $t > 0$. 

Chapter 6 

2. $\{N^*(t), t \ge 0\}$ is a Poisson process with rate $\lambda = 1$. Thus, it is indeed a pure birth 
process. 

4. (a) 1/3; (b) 1/3; (c) 2/3. 

6. (a) 2.5; (b) 2.72; (c) 2.89. 

8. 1/16. 

10. 0.079. 

12. $(1 - e^{-2\mu})/(2\mu)$. 

14. We define the states: 

0: the system is empty; 
$n_0$: there are $n$ customers in the system and one is waiting 
to know whether the server can serve him or her; 
$n$: there are $n$ customers in the system and one is being served, 

for $n = 1, 2, 3$. The balance equations are: 

state $j$    departure rate from $j$ = arrival rate to $j$ 

0            $\lambda \pi_0 = \mu_0 p\, \pi_{1_0} + \mu \pi_1$ 
$1_0$        $(\lambda + \mu_0) \pi_{1_0} = \lambda \pi_0 + \mu_0 p\, \pi_{2_0} + \mu \pi_2$ 
$2_0$        $(\lambda + \mu_0) \pi_{2_0} = \lambda \pi_{1_0} + \mu_0 p\, \pi_{3_0} + \mu \pi_3$ 
$3_0$        $\mu_0 \pi_{3_0} = \lambda \pi_{2_0}$ 
1            $(\lambda + \mu) \pi_1 = \mu_0 (1 - p) \pi_{1_0}$ 
2            $(\lambda + \mu) \pi_2 = \mu_0 (1 - p) \pi_{2_0}$ 
3            $\mu \pi_3 = \mu_0 (1 - p) \pi_{3_0}$ 


16. (a) $\mu/[4(\mu + \lambda)]$; (b) $1 - \{\mu/[2(\mu + \lambda)]\}$. 

18. 207/256. 

20. 0.77. 


Chapter 7 


2. Not a Gaussian process. 

4. Gaussian and weakly stationary. Hence, strictly stationary. 


6. 

$x$             -4     -2     0      2      4      $\Sigma$ 
$p_{X_n}(x)$    1/8    1/4    1/4    1/4    1/8    1 


8. (a) 4a?; (b) £?=„ Qa?. 



12. (a) Invertible; (b) not invertible. 

14. $E[Y_n] = \sigma^2(1 + \theta_1^2)$; $\mathrm{VAR}[Y_n] = 2\sigma^4(1 + 2\theta_1^2 + \theta_1^4)$. 

16. (a) ^ j X n \ (b) not similar. 

18. - 

20. ~ 0.527(0). 


E 


Answers to multiple choice questions 


Chapter 1 

1 b; 2 e; 3 b; 4 c; 5 a; 6 d; 7 b; 8 c; 9 e; 10 a. 

Chapter 2 

1 c,b,b,b,b,e; 2 c; 3 d; 4 d; 5 d; 6 c; 7 d; 8 e; 9 a; 10 c; 11 c; 12 b; 13 c; 14 a. 

Chapter 3 

1 d; 2 a; 3 d; 4 b; 5 c; 6 c; 7 a; 8 a; 9 e; 10 a; 11 b; 12 e; 13 a; 14 d; 15 a; 16 c; 17 c; 18 
e; 19 c; 20 c. 

Chapter 4 

1 e; 2 b; 3 c; 4 a; 5 a; 6 e; 7 c; 8 d; 9 b; 10 c; 11 d; 12 c; 13 c; 14 e; 15 d; 16 c; 17 b; 18 
e; 19 d; 20 c. 

Chapter 5 

1 d; 2 e; 3 c; 4 a; 5 b; 6 c; 7 c; 8 c; 9 d; 10 d. 

Chapter 6 

1 b; 2 d; 3 e; 4 b; 5 d; 6 a; 7 c; 8 d; 9 b; 10 c. 

Chapter 7 

1 c; 2 d; 3 d; 4 a; 5 b; 6 b; 7 e; 8 b; 9 c; 10 b. 



References 


1. Barnes, J. Wesley, Statistical Analysis for Engineers: A Computer-Based Approach , 
Prentice-Hall, Englewood Cliffs, NJ, 1988. 

2. Breiman, Leo, Probability and Stochastic Processes: With a View Toward Applications , 
Houghton Mifflin, Boston, 1969. 

3. Chung, Kai Lai, Elementary Probability Theory with Stochastic Processes , Springer-Verlag, 
New York, 1975. 

4. Dougherty, Edward R., Probability and Statistics for the Engineering, Computing, and 
Physical Sciences , Prentice-Hall, Englewood Cliffs, NJ, 1990. 

5. Feller, William, An Introduction to Probability Theory and Its Applications , Volume I, 3rd 
edition, Wiley, New York, 1968. 

6. Feller, William, An Introduction to Probability Theory and Its Applications , Volume II, 2nd 
edition, Wiley, New York, 1971. 

7. Hastings, Kevin J., Probability and Statistics , Addison-Wesley, Reading, MA, 1997. 

8. Hines, William W. and Montgomery, Douglas C., Probability and Statistics in Engineering 
and Management Science , 3rd edition, Wiley, New York, 1990. 

9. Hogg, Robert V. and Craig, Allen T., Introduction to Mathematical Statistics , 3rd edition, 
Macmillan, New York, 1970. 

10. Hogg, Robert V. and Tanis, Elliot A., Probability and Statistical Inference , 6th Edition, 
Prentice Hall, Upper Saddle River, NJ, 2001. 

11. Krée, Paul, Introduction aux Mathématiques et à leurs Applications Fondamentales , Dunod, 
Paris, 1969. 

12. Lapin, Lawrence L., Probability and Statistics for Modern Engineering , 2nd edition, 
PWS-KENT, Boston, 1990. 

13. Leon-Garcia, Alberto, Probability and Random Processes for Electrical Engineering , 2nd 
edition, Addison-Wesley, Reading, MA, 1994. 

14. Lindgren, Bernard W., Statistical Theory , 3rd edition, Macmillan, New York, 1976. 

15. Maksoudian, Y. Leon, Probability and Statistics with Applications , International, Scranton, 
PA, 1969. 

16. Miller, Irwin, Freund, John E., and Johnson, Richard A., Probability and Statistics for 
Engineers , 4th edition, Prentice-Hall, Englewood Cliffs, NJ, 1990. 

17. Papoulis, Athanasios, Probability, Random Variables, and Stochastic Processes , 3rd edition, 
McGraw-Hill, New York, 1991. 





18. Roberts, Richard A., An Introduction to Applied Probability , Addison-Wesley, Reading, 
MA, 1992. 

19. Ross, Sheldon M., Introduction to Probability and Statistics for Engineers and Scientists , 
Wiley, New York, 1987. 

20. Ross, Sheldon M., Introduction to Probability Models , 7th edition, Academic Press, San 
Diego, 2000. 

21. Spiegel, Murray R., Theory and Problems of Advanced Calculus , Schaum’s Outline Series, 
McGraw-Hill, New York, 1973. 

22. Walpole, Ronald E., Myers, Raymond H., Myers, Sharon L., and Ye, Keying, Probability 
and Statistics for Engineers and Scientists , 7th edition, Prentice Hall, Upper Saddle River, 
NJ, 2002. 


Index 


Absolute convergence, 17 
Absolutely integrable, 11 
Arrival rates, 194 
Asymmetric distribution, 90 
Autocorrelation function, 228, 229 
Autocovariance function, 228 
ARIMA (Autoregressive integrated moving 
average) process, 251 
ARMA (Autoregressive moving average) 
process, 249 

autocovariance function, 250 
Autoregressive process 
of order 1, 236, 237 
autocorrelation function, 237 
autocovariance function, 238 
variance, 238 
of order p, 239 
autocorrelation function, 241 
autocovariance function, 239 
variance, 243 
weakly stationary, 248 
Auxiliary variable, 127 
Average 

arrival rate, 198 
entering rate, 198 
failure rate, 169 

Backshift operator, 247 
Balance equations, 195, 195 


of a birth and death process, 196 

Bayes’ 

formula, 35 
rule, 35 
Bernoulli 
process, 266 
trials, 62 

Bienaymé-Chebyshev inequality, 93 
Binomial approximation to the hypergeometric 
distribution, 66 
Binomial series, 17 
Birth and death process, 193 
limiting probabilities, 196 
Birth rates, 194 
Bridge system, 177 

Cauchy principal value, 8 
Central limit theorem, 135 
Central moments, 88 
Chain rule, 5 

Characteristic function, 11, 131 
Combination, 38 
Communicating states, 193 
Conditioning, 121 
Continuity correction, 73 
Continuous function, 2 
Continuous-time Markov chain, 192 
Convolution, 22, 128, 130 
Correlation coefficient, 133 
Countably infinite, 27 
Covariance, 132 
function, 228 
matrix, 232 
Cut vector, 180 
minimal, 180 

D’Alembert’s ratio test, 16 
De Moivre-Laplace approximation, 72 
Death rates, 194 

Decreasing failure rate (DFR), 165 
Density function, 57 
conditional, 120 
joint, 118 
marginal, 119 
Departure rates, 194 
Derivative, 3 
Difference operator, 247 
Differentiable function, 3 
Dirac delta function, 4 
Discrete-time Markov chain, 230 
Distribution 
Bernoulli, 62 
beta, 79 
generalized, 79 
binomial, 61 
binormal, 233 
bivariate normal, 233 
Cauchy, 8 
Erlang, 75 
exponential, 76 
double, 77 
shifted, 168 
gamma, 74 
Gaussian, 70 
geometric, 64 
hypergeometric, 66 
Laplace, 77 
lognormal, 80 
multinormal, 231 
negative binomial, 65 
normal, 70 


standard, 71 
truncated, 166 
Pascal, 65 
Poisson, 68 
uniform, 79 
Weibull, 77 

Distribution function, 56 
joint, 115, 118 

Elementary outcome, 27 
Erlang loss system, 219 
Erlang’s formula, 219 
Error function, 235 
Event(s), 28 
compound, 28 
equally likely, 30 
equiprobable, 30 
incompatible, 28 
independent, 33, 33 
mutually exclusive, 28 
simple, 28 
Expected value, 84 

of a function of a random variable, 83 
of a function of a random vector, 131 
Extreme value distribution, 77, 184 

Failure rate function, 163 
Fourier transform, 11 
Function of a random variable, 81 
Fundamental theorem of calculus, 8 

Gamma function, 74 
Gaussian 

stochastic process, 232 
white noise (GWN), 231 
Generating function, 24 
Geometric series, 15, 16 

Hazard rate function, 163 
Heaviside function, 2 

IID noise, 230 

Improper integral, 7 

Increasing failure rate (IFR), 165 
Independence, 33 

of normal random variables, 133 
Independent increments, 69 
Infinite product, 26 
Integral, 7 
Integrand, 7 
Integration 
by parts, 10 
by substitution, 9 
Interval failure rate, 168 
Interval of convergence, 16 
Inverse Fourier transform, 11 
Irreducible Markov chain, 193 

Jump discontinuity, 58 

k-out-of-n system, 176 
Kurtosis coefficient, 91 

Lag operator, 247 
Laplace transform, 12 
Law of large numbers, 135 
Left-skewed distribution, 90 
L’Hospital’s rule, 5 
Limit, 1 

Limiting probabilities, 195 
Linear combination, 131, 231 
of normal random variables, 131 
Linearly deterministic process, 247 
Little’s formula, 199 
Loss system, 218 


M/M/1 queueing model, 199 
limiting probabilities, 201 
M/M/1 queueing model with finite capacity, 
207 

limiting probabilities, 208 
M/M/s queueing model, 212 
limiting probabilities, 214 
M/M/s/c queueing model, 218 
M/M/s/s queueing model, 218 
limiting probabilities, 218 
M/M/∞ queueing model, 215 
limiting probabilities, 215 
Mode, 86 

Moment-generating function, 12 
Moments about 
the mean, 88 
the origin, 88 
the point a, 88 

Moving average process of order q , 244 
autocorrelation function, 245 
covariance, 245 
invertible, 248 
Multiplication 
principle, 38 
rule, 32 

Noncentral moments, 88 
Normal approximation to the binomial 
distribution, 73 

Ornstein-Uhlenbeck process, 263 


Markov(ian) process, 192 
Mathematical expectation, 84 
conditional, 121 
Mean time 

between failures (MTBF), 162 
to failure (MTTF), 162 
to repair (MTTR), 162 
Median, 84 

Memoryless property, 65, 76 
Minimal 
cut set, 180 
path set, 180 


Partial 

autocorrelation function, 251 
derivative, 6 
sum of a series, 15 
Partition, 7, 35 
Path vector, 180 
minimal, 180 
Percentile, 86 
Permutation, 38 
Piecewise continuous, 2 
Poisson approximation to the binomial 
distribution, 68 
Poisson process, 69 
Power series, 16 
Predictor, 255 
best linear, 256 
optimal, 255 
Probability, 29 
conditional, 32 

Probability (mass) function, 56 
conditional, 116 
joint, 115 
marginal, 116 
Pure 

birth process, 193 
death process, 193 

Purely nondeterministic process, 247 
Quantile, 86 

Radius of convergence, 16 
Random 

experiment, 27 
process, 191 
variable(s), 55 
continuous type, 57 
discrete type, 55 
i.i.d., 135 

independent, 116, 120 
mixed type, 83 
vector, 115 
continuous, 118 
discrete, 115 
Range, 86 

Relation between the gamma and 
Poisson distributions, 75 
Relative frequency, 29 
Reliability function, 161 
Right-skewed distribution, 90 

Sample space, 27 
continuous, 27 
discrete, 27 
Sampling fraction, 67 
Sequence, 14 
Series, 15 


Shifted exponential distribution, 168 
Skewness coefficient, 90 
Standard 

Brownian motion, 235, 263 
deviation, 87 

State 

space, 191 

transition diagram, 200 
variable, 227 
Stationary 

increments, 69 
stochastic process, 227 
Stochastic process, 191 
continuous-time, 191 
discrete-time, 192 
second-order stationary, 228 
weakly stationary, 228 
Structure function, 178 
monotonic, 179 
Sum of 

exponential random variables, 130 
Poisson random variables, 130 
Survival function, 161 

Time series, 227 
Time-homogeneous 
Markov chain, 230 
transition probabilities, 192 
Total probability rule, 35 
Traffic intensity, 200 

Transition function of a Markov chain, 193 
Tree diagram, 36 

Truncated normal distribution, 166 
Utilization rate, 200 
Variance, 87 

of a linear combination, 132 
Venn diagram, 28 

Wold’s decomposition theorem, 246 

Yule process, 220 

Yule-Walker equations, 241 


